About LiveX AI

LiveX AI is the #1 human-like physical AI agent platform for B2C enterprises, unifying digital and real-world interactions into one intelligent system.

Machine Learning Engineer — Realtime Interactive Avatars

Location: Palo Alto

Department: Engineering

Type: On-Site

Salary: TBD

Job Description

Posted on: May 27, 2026

Job Overview: LiveX AI is building the next generation of realtime, interactive AI avatars—lifelike digital humans that see, listen, and respond with natural speech and expressive motion. We are hiring a strong Machine Learning Engineer to help us train, fine-tune, and serve the generative models that power these avatars. You will work across the full model lifecycle—from data curation and training of diffusion, video, and multimodal models, to low-latency inference optimization for live, streaming deployments.

Your work will directly shape how millions of end users experience AI—turning static chatbots into engaging, face-to-face digital humans that deliver VIP-level customer experiences in real time.

Responsibilities:

  • Train and fine-tune state-of-the-art generative models for talking-head synthesis, audio-driven facial animation, and full-body avatar video (e.g., diffusion models, DiT, 3D Gaussian Splatting, GAN-based and transformer-based video generators).
  • Design data pipelines for large-scale video, audio, and multimodal datasets—including collection, cleaning, labeling, alignment, and augmentation.
  • Optimize models for realtime, low-latency serving (sub-200 ms end-to-end) through model distillation, quantization, pruning, caching, streaming inference, and custom CUDA/Triton kernels where needed.
  • Build and maintain the training, evaluation, and deployment infrastructure on GPU clusters (PyTorch, DeepSpeed, FSDP, Ray, Kubernetes).
  • Define and track quality metrics—lip-sync accuracy, identity preservation, expression naturalness, latency, and throughput—and drive measurable improvements.
  • Collaborate closely with backend, product, and research teams to ship avatar features into production and iterate rapidly based on real user feedback.
  • Stay current with the latest research in generative video, multimodal AI, and neural rendering, and bring novel techniques into our stack.

Qualifications:

  • 3+ years of hands-on ML engineering experience (or strong research experience with production-quality code), with a track record of shipping deep learning models.
  • Deep expertise in PyTorch and modern deep learning workflows; comfortable reading, reproducing, and extending research papers.
  • Strong background in one or more of: diffusion models, video generation, generative AI, computer vision, multimodal AI (audio + vision), speech synthesis, or neural rendering.
  • Experience with talking-head / audio-driven animation / lip-sync / digital human systems, or closely related areas (3D Gaussian Splatting, NeRF, face reenactment, avatar synthesis) is a strong plus.
  • Proven ability to optimize models for realtime inference, including familiarity with TensorRT, ONNX, Triton Inference Server, CUDA, mixed precision, and streaming pipelines.
  • Solid software engineering fundamentals in Python; experience with distributed training on multi-GPU / multi-node setups.
  • Comfortable working in ambiguous and rapidly evolving environments, with a proactive, ownership-driven mindset.
  • Fast learner with a startup mentality, eager to push the frontier of what interactive AI avatars can do.
  • Publications at top venues (NeurIPS, CVPR, ICCV, SIGGRAPH, ICLR, ICML) or notable open-source contributions are a plus.

What We Offer:

  • A rare opportunity to build realtime interactive avatars at the intersection of generative video, speech, and multimodal AI.
  • Access to substantial GPU compute and a team that ships to real customers.
  • A collaborative, innovative, and dynamic work environment.
  • Competitive salary, equity, and benefits package.
  • Career growth opportunities in a rapidly expanding sector.

How to Apply:

Please submit your resume, a short note on the most interesting generative / avatar / multimodal project you have worked on, and any relevant publications, demos, or portfolio links to contact@livex.ai. We’re excited to hear from you!

LiveX AI is an Equal Opportunity Employer.

Company Description:

LiveX AI Inc. is a trailblazing AI SaaS company dedicated to empowering everyday living. Our skilled team, drawn from top-tier institutions and companies, is committed to leveraging AI to enhance customer experience and business efficiency, reshaping how businesses interact with technology. Join us in building the future of realtime interactive AI avatars.
