Bill Cai

About Me

Hello! I'm Bill, an ML researcher/practitioner based in Singapore. I lived in Chicago, Boston, and the Bay Area for 5 years before moving back to Singapore in 2019.

I enjoy building machine learning algorithms and systems that have a real-world impact on people's lives. My goal is to build end-to-end ML products and solutions that engineers enjoy working on and that have positive outcomes for society.

After graduating from the research-based master's program at MIT's Center for Computational Engineering, I joined the data science team at One Concern to build large-scale computer vision and network modelling systems that estimate the impacts of natural disasters. I then worked at GovTech Singapore for 3 years, where my team and I built a cloud-based image and video analytics platform. Now, I am at Amazon Web Services, leading science and ML research work for generative AI applications across text and images.

Here's what I've been working with lately:

Deep Learning, Computer Vision
Applied Math, Statistics
TensorFlow, PyTorch
Flask, SQL, JavaScript
Kubernetes, Docker, Serverless
Python, Julia, R
AWS Architectures

Where I've Worked

Senior Applied Scientist @ Amazon Web Services

2022 - current

The AWS Generative AI Innovation Center collaborates globally to deploy generative AI solutions. I have led science efforts and key strategic customer engagements in ASEAN, India, and Korea.
Led efforts to optimize and improve internal LLMs, reducing memory footprint by 70% and deployment costs by 50-80% without compromising performance.
Tech lead for novel genAI applications, including usage and benchmarking of LLMs for education, LLM agents for financial services, and accurate and verifiable LLM analysis for legal and investment professionals.
Contributor to popular and emerging open-source libraries for LLM inference and benchmarking, including AlpacaEval, AutoGPTQ, and Text Generation Inference.

Some Things I've Built

Featured Project

Multimodal harmful meme detection

I won second place in an international competition to detect harmful content in multimodal and multilingual memes. The competition was run in a zero-shot fashion, with no training or validation data released. I built a reusable pipeline that robustly evaluates a meme in under 2 seconds, combining BLIP captioning and multilingual OCR with a QLoRA fine-tuned LLM trained on a synthetically generated dataset. The competition report is published as a short paper in the ACM Web Conference 2024 proceedings.

Multimodal LLMs
Synthetic data

Featured Project

GenAI for Education

I led a collaboration project with Singapore's Ministry of Education to implement and benchmark LLMs and to evaluate open-ended language tasks, with the aim of creating solutions that assist educators and expand student access to feedback. This led to a successful implementation on AWS and a research paper accepted to NAACL 2024. I also made open-source contributions to the popular LLM benchmark library AlpacaEval.

LLMs
HF Transformers
PyTorch

Featured Project

WOG AI/ML Platform

I led a collaboration project with Singapore's Government Technology Agency to integrate generative AI capabilities into MAESTRO, GovTech's ML/AI platform. Through the project, we collaborated on inference optimizations for large language models, achieving a 75% reduction in inference costs with no significant decrease in quality benchmarks. Public sector users develop on the platform to enable diverse and impactful use cases. For example, CPF Board uses the platform to summarize over 600,000 call transcripts per year, and the Ministry of Manpower developed an LLM-powered sensemaking tool that processes over 1 million documents and saves over 2,000 work hours annually.

LLMs
Inference Optimization
HuggingFace TGI

Featured Project

Regional LLMs

I led the engineering efforts to integrate LLMs for regional languages in SageMaker JumpStart. This includes SEA-LION, a family of 3B-7B models trained on Southeast Asian languages. The work included making changes to popular optimized LLM inference libraries like Text Generation Inference.

Optimized LLM inference
HF Transformers
PyTorch

Featured Project

GenAI Storytelling

I led the science and research work to enable immersive storytelling with LLMs and diffusion models. I worked on decreasing overall latency from 5 minutes to under 30 seconds, measuring and enhancing art style and character consistency, and reducing occurrences of undesirable content. The immersive storytelling was an overwhelming success for the National Library Board, resulting in nationwide news coverage and repeated extensions of the project and display.

LLMs
Diffusion Models
PyTorch

Featured Project

Crowd Estimation at Scale

Working closely with stakeholders in National Parks Board, my team and I developed a cloud-based crowd estimation system. By connecting over 180 cameras nationwide to our AI-enabled people counting backend, we were able to reduce and optimise the heavy operational requirements for NParks to deploy their officers to ensure park safety, and we also provided members of the public with useful real-time data to self-manage their own itineraries. This event-driven system was delivered iteratively, scaling quickly from our initial pilot of 5 cameras to 100+ cameras within 3 months. The public-facing website has received 450k-800k monthly visits since 2020, and the project was covered by Singapore primetime news.

AWS
PyTorch
Serverless

Featured Project

Treepedia

Treepedia measures the canopy cover in cities. We've developed a scalable and universally applicable method by analyzing the amount of green perceived while walking down the street. The visualization maps street-level perception only, so your favorite parks aren't included. Our work has been featured in popular news outlets such as The Wall Street Journal, The Guardian, Forbes and Wired. Our Treepedia 2.0 paper was accepted at the IEEE BigData Congress 2018, and we explore our deep learning application in the area of climate change in a NeurIPS 2019 Climate Change workshop paper.

TensorFlow
Python
PostGIS

Other Noteworthy Projects

Vehicle Classification API

A vehicle classification model. A REST API that classifies vehicles. An auto-scaling backend on Kubernetes that classifies vehicles. All in one repo.

Parking Utilisation

Featured in Fortune magazine and NVIDIA Developer News, our project optimises for the minimal number of video frames required for accurate parking utilisation measurements. This enables large-scale quantification of parking usage for urban planning purposes. Our results are published in the IEEE Internet of Things Journal.

Roboat

Sensor fusion of lidar and RGB images for motion planning and obstacle avoidance for boats. Featured on CNN and CNBC.

What's Next?

Get In Touch

I'm always keen to chat about new ideas and collaboration opportunities.

Say Hello

Where I've Worked

Senior Applied Scientist @ Amazon Web Services

2022 - current

Computational Scientist @ GovTech

2019 - 2022

Computer Vision Scientist @ One Concern

2018 - 2019

Graduate Researcher @ Senseable City Lab

2017 - 2018
