Head of Supercomputing
Company: Etched
Location: San Jose
Posted on: April 1, 2026
Job Description:
About Etched
Etched is building the world’s first AI inference
system purpose-built for transformers, delivering over 10x higher
performance and dramatically lower cost and latency than a B200.
With Etched ASICs, you can build products that would be impossible
with GPUs, like real-time video generation models and extremely
deep and parallel chain-of-thought reasoning agents. Backed by
hundreds of millions from top-tier investors and staffed by leading
engineers, Etched is redefining the infrastructure layer for the
fastest-growing industry in history.

Job Summary
Etched is building
at-scale AI inference supercomputers powered by our custom ASICs,
and the Supercomputing organization is responsible for making them
real, deployable, and reliable. We are seeking a Head of
Supercomputing to define and lead the architecture, software stack,
and operational model for Etched’s cluster-scale AI compute
systems. This leader will own the end-to-end system software and
control-plane strategy — spanning orchestration, telemetry,
provisioning, networking, and fleet reliability — from first
silicon through production deployment. This role combines deep
systems expertise with strong organizational leadership. You will
build and lead a world-class team, partnering closely with ASIC,
hardware, kernel, runtime, and infrastructure teams to deliver the
highest-performance AI inference systems in the world.

Key Responsibilities
- Define and drive the technical vision and roadmap for Etched’s Supercomputing software stack, from node bring-up to multi-rack clusters
- Build, scale, and lead a high-performance Supercomputing organization
- Directly manage and develop 15 engineers with a variety of experience levels
- Architect and own low-level control-plane software for system bring-up, provisioning, networking, configuration, and fleet management
- Define orchestration primitives for managing devices, nodes, racks, and full cluster deployments
- Oversee development of system services that interface directly with firmware, drivers, kernel subsystems, and runtime layers
- Establish system telemetry and observability infrastructure for customers
- Own the system software lifecycle from first silicon bring-up through stable production releases
- Collaborate with manufacturing and test engineering to integrate diagnostics and system software into factory environments
- Define reliability targets, operational metrics, and release processes for production deployments
- Recruit, mentor, and retain exceptional systems engineers and engineering leaders
- Act as a senior technical voice shaping company-wide infrastructure decisions

Must-Have Skills and Experience
- 15 years of experience in system software, infrastructure, or large-scale compute systems, including 5 years leading engineering teams
- Strong understanding of hardware/software interfaces such as PCIe, RDMA, memory hierarchies, interrupts, and device drivers
- Experience building or operating cluster-scale systems (HPC, AI infrastructure, hyperscale compute, or custom accelerators)
- Proven track record of delivering complex systems from early bring-up through production
- Strong debugging skills across hardware–software interactions
- Excellent leadership, communication, and cross-functional collaboration skills

Benefits
- Medical, dental, and vision packages with generous premium coverage
- $500 per month credit for waiving medical benefits
- Housing subsidy of $2k per month for those living within walking distance of the office
- Relocation support for those moving to San Jose (Santana Row)
- Various wellness benefits covering fitness, mental health, and more
- Daily lunch and dinner in our office

How we’re different
Etched believes in the Bitter Lesson. We think most of the progress in
the AI field has come from using more FLOPs to train and run
models, and the best way to get more FLOPs is to build
model-specific hardware. Larger and larger training runs encourage
companies to consolidate around fewer model architectures, which
creates a market for single-model ASICs. We are a fully in-person
team in San Jose (Santana Row), and greatly value engineering
skills. We do not have boundaries between engineering and research,
and we expect all of our technical staff to contribute to both as
needed.
Keywords: Etched, Berkeley, Head of Supercomputing, Engineering, San Jose, California