
Cloud Infrastructure / Platform Engineer

1Mind

This job is no longer accepting applications


Software Engineering, Other Engineering

Posted 6+ months ago

Location: Remote - US

Employment Type: Full time

Location Type: Remote

Department: Engineering

About Us

1mind is a platform that deploys multimodal Superhumans for revenue teams. These Superhumans combine a face, a voice, and a GTM brain — equipped with deep technical and product knowledge. They can lead unlimited, simultaneous conversations 24/7, meeting buyers when they’re most active and engaged. Superhumans qualify leads, book meetings, deliver pitches, give interactive demos, handle objections, uncover pain points, build value models, provide support, and onboard customers. They live across websites, inside your product, can join live calls as active participants, and work alongside your team in deal rooms. 1mind Superhumans integrate seamlessly into existing workflows, scale instantly, and drive measurable impact — growing revenue, reducing headcount, accelerating pipeline to closed-won, and creating a more delightful buyer experience.

Job Description

The Cloud Infrastructure team builds and maintains the platforms and abstractions that let 1mind ship quickly, securely, and at scale. You’ll design and operate development and production platforms that underpin multimodal, low-latency AI experiences. Like every engineering team here, we own the reliability of what we build, including participating in an on-call rotation for critical incidents.

Key Responsibilities

Design, build, and operate Kubernetes-based platforms (multi-cluster/multi-region) for high availability, security, and cost efficiency.
Create cloud abstractions and paved paths (IaC modules, service templates, golden images) that accelerate safe and consistent delivery.
Own CI/CD, deployment strategies, and runtime orchestration for services handling real-time, high-throughput Superhuman interactions.
Evolve networking and traffic management (ingress, gateways, service mesh, edge/CDN, load balancing, DNS) with strong SLOs.
Implement defense-in-depth: identity and access, secrets management, isolation, policy as code, auditability, and compliance guardrails.
Build observability into everything (metrics, logs, traces, profiling), define and monitor SLOs/SLIs, and drive incident response & postmortems.
Plan capacity and scale the platform by an order of magnitude: autoscaling, workload placement, caching strategies, and storage choices.
Partner with ML/IR, product, and backend teams to deliver reliable foundations for retrieval, memory, and tool-call workflows.
Contribute to an inclusive culture that welcomes diverse perspectives and enables candid and constructive debate.

Qualifications

5+ years building and operating core infrastructure or platform engineering systems.
Hands-on experience running Kubernetes at scale (cluster lifecycle, operators/controllers, workloads, security).
Proven track record of creating abstractions over cloud platforms (modules/libraries, internal platforms, or PaaS-like experiences).
Strong grounding in networking (L4/L7), container runtimes, Linux, storage, and distributed systems fundamentals.
Proficiency with Infrastructure as Code and automation (e.g., Terraform/Pulumi, GitOps, CI/CD).
Focus on building scalable, reliable, and secure systems; comfortable with ambiguity and rapid change.
Plus: Experience with service mesh (e.g., Istio/Linkerd), policy as code (OPA), secrets/PKI, multi-cloud, GPU scheduling, or cost governance.
Plus: Exposure to GCP/AWS, edge/CDN, data plane performance, or compliance frameworks.

Why Join Us?

Build the platform foundation behind multimodal Superhumans redefining enterprise GTM and support.
Small, high-ownership team shipping meaningful infrastructure used across products.
Work at the intersection of AI, systems, and safety with measurable business impact.
Global footprint (San Francisco & Bengaluru) and a culture that values learning from production while prioritizing responsible, safe deployment.

Reliability & On-Call

We own the reliability of the systems we build. This role includes participating in an on-call rotation and responding to critical incidents when needed.
