Implement token-efficient patterns when working with AI models to optimize costs and performance. Key practices include:
1. Batch identical prompts into single API calls where supported
Example of efficient batching (sketched here with LangChain's ChatOpenAI wrapper for the OpenAI API):
import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

const messages = [new HumanMessage("Summarize the report: ...")];
const model = new ChatOpenAI({ model: "gpt-4o-mini" });

// Inefficient: N separate calls for the same prompt; the input
// tokens are billed again on every call
const separateResults = await Promise.all(
  Array.from({ length: 5 }, () => model.generate([messages]))
);

// Efficient: a single call configured to return n completions;
// OpenAI charges the input tokens only once
const batchedModel = new ChatOpenAI({ model: "gpt-4o-mini", n: 5 });
const result = await batchedModel.generate([messages]);

// For accurate token counting before sending a request:
const { totalCount } = await model.getNumTokensFromMessages(messages);
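When n completions are requested in one call, all candidates come back in a single response. Under the LangChain result shape assumed above, they land in the first generations entry:

const completions = result.generations[0].map((g) => g.text);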
This approach can significantly reduce costs, especially when input token counts are high relative to output. Providers differ in their batching capabilities, so consult each provider's documentation for the optimal usage pattern.
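As a rough sketch of the savings, compare N separate calls against one batched call; the prices and token counts below are illustrative assumptions, not real rates:

// Hypothetical figures for illustration only
const inputTokens = 4000, outputTokens = 200, n = 10;
const inPrice = 2.5 / 1_000_000, outPrice = 10 / 1_000_000; // assumed $/token

const separateCost = n * (inputTokens * inPrice + outputTokens * outPrice);
const batchedCost = inputTokens * inPrice + n * outputTokens * outPrice;
// separateCost: $0.12, batchedCost: $0.03; the repeated input tokens dominate the bill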