DevOps Engineer

Vetric · senior


Hands-on labs

3 scenarios

Bash: EKS Fleet Health Reporter (easy)

Write a Bash tool that audits a Kubernetes cluster and produces a health report suitable for a first-DevOps-hire onboarding audit. No cluster access needed — the lab ships canned kubectl JSON fixtures you parse with jq.
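The lab itself asks for Bash and jq. As a language-neutral illustration of the parsing logic, here is a minimal Python sketch. The fixture contents, the `summarize_pods` helper, and the restart threshold are all hypothetical; only the field names (`status.phase`, `containerStatuses[].ready`, `restartCount`) follow the standard `kubectl get pods -o json` shape.

```python
import json
from collections import Counter

# Hypothetical fixture: a trimmed-down `kubectl get pods -A -o json` payload.
FIXTURE = """
{
  "kind": "List",
  "items": [
    {"metadata": {"namespace": "scrapers", "name": "crawler-0"},
     "status": {"phase": "Running",
                "containerStatuses": [{"name": "app", "ready": true, "restartCount": 0}]}},
    {"metadata": {"namespace": "scrapers", "name": "crawler-1"},
     "status": {"phase": "Running",
                "containerStatuses": [{"name": "app", "ready": false, "restartCount": 7}]}},
    {"metadata": {"namespace": "ingest", "name": "writer-0"},
     "status": {"phase": "Pending"}}
  ]
}
"""


def summarize_pods(pod_list, restart_threshold=5):
    """Count pods per phase and flag pods that look unhealthy.

    restart_threshold is an arbitrary illustrative cutoff, not a lab requirement.
    """
    phases = Counter()
    flagged = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        phases[phase] += 1
        if phase == "Succeeded":
            continue  # completed Job pods are not a health problem
        containers = status.get("containerStatuses", [])
        restarts = sum(c.get("restartCount", 0) for c in containers)
        not_ready = any(not c.get("ready", False) for c in containers)
        if phase != "Running" or not_ready or restarts >= restart_threshold:
            flagged.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return {"phases": dict(phases), "flagged": sorted(flagged)}


report = summarize_pods(json.loads(FIXTURE))
print(json.dumps(report, indent=2))
```

In jq the same idea would likely reduce to a `select` over `.items[]`; the point here is the report shape: counts per phase plus a list of pods worth a closer look.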


Terragrunt: Multi-Env EKS Module (medium)

Design a Terragrunt layout that stamps out a shared EKS module across dev/staging/prod without copy-pasted HCL. Uses OpenTofu + LocalStack-compatible AWS provider so `tofu plan` works offline.


KEDA: Scale Scrapers on Queue Depth (hard)

Deploy a toy scraper workload in your k3s namespace that scales 0→N based on queue depth, mirroring Vetric's event-driven data-pipeline shape. Uses Prometheus + KEDA ScaledObject.
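To reason about what the ScaledObject should do before writing any YAML, it can help to model the replica math. The sketch below is a simplified approximation, assuming an average-value style target (replicas ≈ queue depth ÷ target per replica) with KEDA handling the 0↔1 activation step. It ignores HPA tolerance, stabilization windows, and the cooldown period, and the function name and default values are hypothetical.

```python
import math


def desired_replicas(queue_depth, target_per_replica,
                     min_replicas=0, max_replicas=20,
                     activation_threshold=0):
    """Approximate the replica count for a queue-depth trigger.

    Simplified model only: real behaviour also depends on polling
    interval, cooldown, and HPA stabilization settings.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Scale-to-zero applies only when the minimum is 0 and the
    # metric is at or below the activation threshold.
    if min_replicas == 0 and queue_depth <= activation_threshold:
        return 0
    wanted = math.ceil(queue_depth / target_per_replica)
    floor = max(1, min_replicas)  # once active, at least one replica runs
    return max(floor, min(wanted, max_replicas))


for depth in (0, 1, 120, 5000):
    print(depth, "->", desired_replicas(depth, target_per_replica=50))
```

With a target of 50 messages per replica, an empty queue maps to zero replicas, 120 queued messages to three, and a large backlog is capped at the configured maximum.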



ABCD tests

3 difficulty tiers · multiple choice

easy exam

10 questions


medium exam

10 questions


hard exam

10 questions


Design interviews (Beta)

3 whiteboard scenarios

medium

Design the observability platform for Vetric's first real infra buildout: ~200 microservices scraping public web data, 20+ countries of customers, one small team on-call. What do metrics, logs, and traces look like end-to-end?

Directly matches the JD's monitoring stack list and the 'first DevOps hire builds the foundation' framing.


hard

Design the ingest pipeline for a new scraper tier that collects 200k public posts/sec across 20+ countries, feeding cybersecurity and public-safety customers who need ≤5min end-to-end latency.

Vetric's entire business is high-volume public-data ingest — this is the core architectural surface the hire will own.
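A whiteboard answer to the ingest scenario usually starts with back-of-envelope sizing. The sketch below shows the arithmetic only. The 200k posts/sec rate and the 5-minute budget come from the prompt; the average post size, per-partition throughput, and headroom factor are illustrative assumptions, not figures from Vetric or any specific streaming system.

```python
import math

POSTS_PER_SEC = 200_000        # from the scenario prompt
LATENCY_BUDGET_SEC = 5 * 60    # from the scenario prompt (<= 5 min end-to-end)
AVG_POST_BYTES = 2_000         # ASSUMPTION: average serialized post size
PER_PARTITION_MB_PER_SEC = 10  # ASSUMPTION: sustainable write rate per partition
HEADROOM = 2.0                 # ASSUMPTION: burst/replay safety factor


def size_pipeline():
    """Return rough sizing figures for a partitioned ingest log."""
    ingest_mb_per_sec = POSTS_PER_SEC * AVG_POST_BYTES / 1_000_000
    partitions = math.ceil(ingest_mb_per_sec * HEADROOM / PER_PARTITION_MB_PER_SEC)
    daily_tb = ingest_mb_per_sec * 86_400 / 1_000_000
    # Upper bound on posts that can be in flight without breaching the budget.
    max_in_flight_posts = POSTS_PER_SEC * LATENCY_BUDGET_SEC
    return {
        "ingest_mb_per_sec": ingest_mb_per_sec,
        "partitions": partitions,
        "daily_tb": daily_tb,
        "max_in_flight_posts": max_in_flight_posts,
    }


sizing = size_pipeline()
for key, value in sizing.items():
    print(f"{key}: {value}")
```

Changing any assumed constant shifts the result linearly, which is the useful part in an interview: state the assumption, show the number, and say which input you would measure first.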


hard

Vetric is bootstrapped-profitable and your AWS bill just hit a number leadership doesn't love. Redesign the EKS + scraping footprint to cut 40% of cloud spend without reducing throughput or SLA.

'Profitable from day one, fully bootstrapped' is the loudest culture signal in the JD — cost discipline is part of the job.


Stack

9 mentioned · 2 inferred

· AWS (EKS, EC2, IAM, VPC)
· Kubernetes
· Terraform / OpenTofu / Terragrunt
· GitHub Actions / Jenkins / ArgoCD
· Prometheus / Grafana / ELK / CloudWatch
· Bash / Python / JavaScript
· Amazon ECS
· KEDA / event-driven autoscaling
· GitHub
· Kafka or similar streaming
· Helm / Kustomize

Culture

· Bootstrapped and profitable — expect cost-consciousness and long-term thinking over hypergrowth spend.
· First DevOps hire with full technical authority — candidate must be opinionated and self-directed.
· Small sharp team, infrastructure-matters tone — engineering discipline valued over ticket-churn.
· Mission-critical customers (cybersecurity, public safety) — reliability and uptime carry real weight.
· Global customer base across 20+ countries — implies 24/7 reliability expectations and likely on-call.
· In-office Tel Aviv role (not flagged remote) — assume hybrid/on-site collaboration.

From the bank

3 for this stack

Write a Bash one-liner that tails every pod's logs in a namespace and highlights any line containing 'ERROR' or '5xx'.
Tell me about a time a deploy went wrong in production. What was the blast radius, how did you recover, and what did you change afterwards?
A customer reports that one of our internal Kubernetes-hosted services is returning 502s intermittently. Walk me through how you'd investigate, starting from zero context.


Original job description

DevOps Engineer
Engineering · Tel Aviv, Israel · Senior · Full-time
Description
What is Vetric?

Vetric builds large-scale public data infrastructure.

We provide data pipelines that collect, structure, and deliver high-volume public web data for mission-critical companies operating in cybersecurity, public safety and digital risk protection.

Our systems power platforms that detect bad actors, uncover impersonation and fraud, identify coordinated manipulation, and help public safety organizations respond faster to real-world risks.

We don’t build dashboards, and we don’t sell surface-level insights.

We build stable, production-grade data flows that become part of our customers’ core products, with real impact: saving lives and protecting large, well-known organizations from bad actors.

Operating globally, we serve industry leaders across more than 20 countries who rely on us for scale, reliability, and depth.



Why Vetric?

Vetric is profitable from day one (fully bootstrapped - we haven’t raised external funding), and we’re building foundational technology - not chasing trends. Because this is infrastructure that matters, we operate with engineering discipline, strong ownership, and long-term thinking.

We’re at a true inflection point: the team is now large enough to require real infrastructure, yet still small enough that what you build will define how things work for the next several years.

This is infrastructure that matters and so is how we operate internally. You’ll be working with a sharp, focused team that takes engineering discipline seriously and is intentionally building an organization that matches the quality of its product.



Position Overview

We are seeking a DevOps Engineer to lead and own the entire DevOps function at Vetric. 

As our first DevOps hire, you won’t just maintain systems, you will set the vision, establish best practices, and build the foundation of our infrastructure strategy for years to come. This is a unique opportunity to step into an impactful role with full technical authority, influencing architectural decisions and guiding how our engineering teams deliver, scale, and secure our large-scale, data-intensive platform.



Responsibilities:

Define and drive Vetric’s infrastructure strategy across all environments
Architect and operate Kubernetes clusters at production scale with a focus on reliability, resilience, and data-heavy workloads
Lead the adoption of Infrastructure as Code (Terraform, OpenTofu, Terragrunt) and establish automation standards
Implement modern CI/CD pipelines (GitHub Actions, Jenkins, ArgoCD, or similar)
Champion observability, monitoring, and reliability engineering practices
Build and optimize infrastructure that powers data-driven pipelines at massive scale
Serve as the technical authority for all DevOps matters, influencing and aligning engineering teams
Partner with engineering leadership to shape infrastructure roadmaps and technology choices
Requirements
Qualifications:

5+ years of deep, hands-on AWS experience (EKS, EC2, networking, IAM, scaling)
Proven success in senior DevOps / Cloud Engineering leadership roles
Expert knowledge of Terraform and modern IaC tools (OpenTofu, Terragrunt)
Strong Kubernetes expertise at scale (design, scaling, optimization)
Experience running high-scale, production-grade environments handling large data volumes
Excellent communication skills with the ability to influence, guide, and align teams
Solid scripting/automation skills (Bash, JavaScript, Python, or similar)
Familiarity with cloud-native monitoring & logging stacks (Prometheus, Grafana, ELK, CloudWatch, etc.)


We’d be lucky if you have:

Experience with Amazon ECS
Proficiency with GitHub or similar platforms (GitLab, Bitbucket, etc.)
Exposure to event-driven architectures and autoscaling frameworks (KEDA or similar)