NewMathData – LLM Implementation & Fine-Tuning via AWS Bedrock

Industry
Scientific research & data analytics
Project type
Case study
Date
March 7th, 2024

Project details

Customizing a domain-specific LLM to power intelligent search and knowledge discovery across scientific datasets.
We’ve helped NewMathData build a data-first platform focused on enabling scientific and mathematical research teams to discover, interact with, and extract value from massive, complex datasets. The company needed a tailored LLM solution to support advanced semantic search, data summarization, and conversational interfaces for non-technical users.


Challenge

Off-the-shelf LLMs lack the contextual understanding necessary to interpret highly specialized scientific data. NewMathData needed a custom implementation that could:

  • Ingest and interpret proprietary datasets with precision
  • Surface meaningful relationships between variables and documents
  • Enable researchers to ask natural-language questions and receive useful, explainable results

The solution needed to be secure, scalable, and built on reliable cloud infrastructure suitable for enterprise and research-grade environments.

Approach

We partnered with NewMathData to scope, deploy, and fine-tune a domain-specific LLM using AWS Bedrock, Amazon’s secure and scalable platform for foundation model deployment.

Our approach included:

  • LLM selection and orchestration via Bedrock (see the invocation sketch after this list)
  • Data preprocessing and vectorization using Amazon SageMaker and embedding pipelines
  • Fine-tuning the foundation model on NewMathData’s proprietary corpus of structured and unstructured data
  • Integrating with a semantic search layer and a natural-language Q&A interface
  • Ensuring end-to-end security, access control, and cost optimization
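
As a minimal illustration of the orchestration layer, the sketch below sends a question to a Bedrock-hosted model through the bedrock-runtime API with boto3. The region, model ID, and prompt are placeholder assumptions, not the configuration used in the engagement.

```python
import json
import boto3

# Hypothetical region and model ID; the engagement's actual choices are not shown here.
REGION = "us-east-1"
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

bedrock_runtime = boto3.client("bedrock-runtime", region_name=REGION)

def ask_model(question: str, max_tokens: int = 512) -> str:
    """Send a single natural-language question to a Bedrock-hosted model."""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": question}]}
        ],
    }
    response = bedrock_runtime.invoke_model(
        modelId=MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=json.dumps(body),
    )
    result = json.loads(response["body"].read())
    return result["content"][0]["text"]

print(ask_model("Summarize the key variables in dataset X."))
```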

Outcome

  • 90%+ query relevance score for research-specific questions after fine-tuning
  • Enabled NewMathData users to interact with their datasets conversationally
  • Improved research efficiency and discovery time for key insights
  • Delivered a scalable AWS-native implementation ready for enterprise deployment

What We Did

01/Foundation Model Setup on AWS Bedrock
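
Setup begins with choosing a base model from Bedrock's catalog. A sketch of that selection step, assuming standard boto3 access and filtering to text models that support fine-tuning:

```python
import boto3

# Assumes Bedrock is enabled in the account and the caller can list foundation models.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# Narrow the candidate set to text-generation models that can be fine-tuned.
models = bedrock.list_foundation_models(
    byOutputModality="TEXT",
    byCustomizationType="FINE_TUNING",
)
for summary in models["modelSummaries"]:
    print(summary["providerName"], summary["modelId"])
```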

02/Data Preparation & Embedding Pipeline
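
A simplified view of the embedding step: documents are chunked and embedded through a Bedrock embedding model before being loaded into the search index. The Titan embedding model ID and chunk size here are illustrative assumptions, not the pipeline's actual parameters.

```python
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
EMBED_MODEL_ID = "amazon.titan-embed-text-v1"  # assumed embedding model

def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Naive fixed-size chunking; a production pipeline would typically split on document structure."""
    return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]

def embed(chunk: str) -> list[float]:
    """Return the embedding vector for one text chunk."""
    response = bedrock_runtime.invoke_model(
        modelId=EMBED_MODEL_ID,
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": chunk}),
    )
    return json.loads(response["body"].read())["embedding"]

document = "raw text extracted from a research dataset or paper"
vectors = [(chunk, embed(chunk)) for chunk in chunk_text(document)]
```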

03/LLM Fine-Tuning & Instruction Tuning
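
Fine-tuning on Bedrock runs as a model customization job over prompt/completion records staged in S3. The sketch below assumes a Titan text base model (only certain Bedrock base models support customization) and uses placeholder S3 paths, role ARN, and hyperparameters; the actual corpus, base model, and settings are specific to the engagement.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# All identifiers below are placeholders.
job = bedrock.create_model_customization_job(
    jobName="newmathdata-ft-001",
    customModelName="newmathdata-research-assistant",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="FINE_TUNING",
    # JSONL file of {"prompt": ..., "completion": ...} records built from the proprietary corpus.
    trainingDataConfig={"s3Uri": "s3://newmathdata-training/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://newmathdata-training/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
)

# Poll the job until it completes, then deploy the resulting custom model.
status = bedrock.get_model_customization_job(jobIdentifier=job["jobArn"])["status"]
print(status)
```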

04/Q&A + Semantic Search Integration
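
The Q&A layer ties the pieces together: embed the user's question, retrieve the most similar chunks, and pass them to the model as grounding context. A minimal retrieval-augmented sketch, reusing the hypothetical embed() and ask_model() helpers from the earlier snippets and plain cosine similarity in place of a production vector store:

```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer(question: str, vectors: list[tuple[str, list[float]]], top_k: int = 3) -> str:
    """Retrieve the top_k most relevant chunks and ask the model to answer from them."""
    q_vec = embed(question)
    ranked = sorted(vectors, key=lambda cv: cosine_similarity(q_vec, cv[1]), reverse=True)
    context = "\n\n".join(chunk for chunk, _ in ranked[:top_k])
    prompt = (
        "Answer the question using only the context below, and say so if the context "
        f"is insufficient.\n\nContext:\n{context}\n\nQuestion: {question}"
    )
    return ask_model(prompt)
```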

05/Governance, Cost Optimization, and Scaling
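
On the governance side, usage and spend can be tracked through the CloudWatch metrics Bedrock publishes. The sketch below assumes the AWS/Bedrock namespace, the InputTokenCount/OutputTokenCount metric names, and a ModelId dimension; exact names should be checked against current AWS documentation.

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

# Daily token totals for one model over the past week (namespace, metric, and dimension names assumed).
for metric in ("InputTokenCount", "OutputTokenCount"):
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/Bedrock",
        MetricName=metric,
        Dimensions=[{"Name": "ModelId", "Value": "anthropic.claude-3-sonnet-20240229-v1:0"}],
        StartTime=now - timedelta(days=7),
        EndTime=now,
        Period=86400,
        Statistics=["Sum"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(metric, point["Timestamp"].date(), int(point["Sum"]))
```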
