Each GitHub Copilot plan includes a monthly allowance of AI credits. Different actions consume credits at different rates, based on the model and the number of tokens processed. This guide covers practical ways to get the most out of your AI credits in Visual Studio Code.
More capable models cost more per token, while lighter models extend your usage further. Match the model to the complexity of the task.
The model picker in chat shows cost details in the hover menu, including cost per token type and a generic cost tier label (Low, Medium, High). Use this information to make informed choices.
For more information, see choosing and configuring language models and best practices for model selection.
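To make the trade-off concrete, here is a minimal sketch of how token counts and per-token rates might combine into a per-request cost. The model names (`light-model`, `reasoning-model`) and all rate values are hypothetical placeholders, not actual Copilot pricing. Substitute the figures shown in the model picker hover menu.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRates:
    """Hypothetical credits charged per one million tokens, by token type."""
    input_per_million: float
    output_per_million: float


# Placeholder values for illustration only; not actual Copilot pricing.
RATES = {
    "light-model": ModelRates(input_per_million=1.0, output_per_million=4.0),
    "reasoning-model": ModelRates(input_per_million=10.0, output_per_million=40.0),
}


def estimate_credits(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the credits one request consumes under the assumed rates."""
    rates = RATES[model]
    return (
        input_tokens * rates.input_per_million
        + output_tokens * rates.output_per_million
    ) / 1_000_000


if __name__ == "__main__":
    # Same request size on both models, to compare the assumed cost tiers.
    for name in RATES:
        print(name, estimate_credits(name, input_tokens=20_000, output_tokens=2_000))
```

Under these assumed rates, the same request costs ten times more on the hypothetical reasoning model, which is why routine edits are usually better suited to a lighter model.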
Jumping straight into code generation can lead to wasted effort if the approach is wrong. It also requires a model with enough reasoning capability throughout the process, which can consume more credits. Instead, separate the planning and implementation phases. This allows you to use a reasoning model for planning, and then switch to a faster, more efficient model for implementation once the plan is solidified.
This workflow ensures the agent understands the requirements before it starts generating code, reducing back-and-forth and rework.
For more information, see plan first, then implement.
Thinking effort controls how much reasoning a model applies to each request. Higher effort levels produce more thinking tokens, which increases both latency and credit consumption. VS Code sets default effort levels based on evaluations and enables adaptive reasoning, in which the model dynamically decides how much to think based on the complexity of each request.
For most tasks, the defaults are sufficient. Only increase thinking effort for genuinely complex problems like architectural planning or multi-step debugging.
For more information, see configure thinking effort.
As a conversation grows, it accumulates context from previous messages, tool outputs, and file contents. When you switch to an unrelated task in the same session, the model still processes all that irrelevant history, which consumes tokens without improving results.
Start a new chat session (⌘N on macOS, Ctrl+N on Windows and Linux) when you change topics. This gives the model a clean context window focused on the current task.
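The effect of accumulated history can be illustrated with a simplified sketch. It assumes every request resends the full conversation so far and ignores prompt caching and other optimizations, so the numbers are illustrative rather than a description of how Copilot actually bills. The function name and turn sizes are hypothetical.

```python
def cumulative_input_tokens(turn_sizes: list[int]) -> int:
    """Total input tokens processed when every turn resends all earlier turns."""
    total = 0
    history = 0
    for size in turn_sizes:
        history += size  # the new turn joins the conversation history
        total += history  # the model reads the whole history on this request
    return total


if __name__ == "__main__":
    # Hypothetical: ten turns of 2,000 tokens each (messages plus tool output).
    turns = [2_000] * 10
    one_session = cumulative_input_tokens(turns)
    # The same work split into two sessions at a topic change after turn five.
    two_sessions = cumulative_input_tokens(turns[:5]) + cumulative_input_tokens(turns[5:])
    print(one_session, two_sessions)
```

Under this simplified model, splitting the work at the topic change processes 60,000 input tokens instead of 110,000, because the second half no longer carries the first half's history.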
When you want to explore an alternative approach or ask a side question, fork the conversation instead of re-prompting from scratch. Forking creates a new session that inherits the existing conversation history, so you don't need to re-establish context.
Use /fork in the chat input to fork the entire session up to the current message.
Every tool call produces output that consumes space in the context window and contributes to credit consumption. Disable tools you don't need for the current task to prevent unnecessary calls.
tools property. This prevents the agent from calling tools that aren't relevant to its workflow.
For more information, see Use tools in chat.
Large generated files, build outputs, or irrelevant directories can be included in the AI context, increasing token usage without adding value. Exclude these files to reduce unnecessary context:
Use a .gitignore file to exclude files from the workspace index. The workspace index respects .gitignore rules.
For more information, see workspace context.
When a conversation grows long, use /compact to summarize older parts of the conversation and reclaim context window space. You can optionally add instructions to guide the summary, for example /compact focus on the API design decisions.
For more information, see context compaction.
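Conceptually, compaction trades older messages for a shorter summary while keeping recent turns verbatim. The sketch below illustrates that idea only; it is not how VS Code implements /compact. The names `compact`, `keep_recent`, and `naive_summary` are hypothetical, and the stand-in summarizer simply counts messages where a real one would ask a model for a summary.

```python
from typing import Callable


def compact(
    messages: list[str],
    keep_recent: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace all but the last `keep_recent` messages with one summary message."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    if len(messages) <= keep_recent:
        return list(messages)  # nothing old enough to compact
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent


def naive_summary(older: list[str]) -> str:
    """Stand-in summarizer; a real one would ask a model for a summary."""
    return f"[summary of {len(older)} earlier messages]"


if __name__ == "__main__":
    history = [f"message {i}" for i in range(1, 9)]
    print(compact(history, keep_recent=2, summarize=naive_summary))
```

In this sketch, eight messages shrink to three: one summary plus the two most recent messages. The summary is where optional instructions, such as focusing on API design decisions, would shape what is preserved.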
You can view your current Copilot usage in the Copilot status dashboard, available through the VS Code Status Bar. The dashboard shows the percentage of your monthly AI credit allowance that you have used (and, on the Copilot Free plan, your inline suggestion usage).
Visit the GitHub Copilot documentation for more information about monitoring usage and entitlements.
You can also run the /chronicle:cost-tips command in any chat session to get personalized recommendations for optimizing your AI credit usage based on your recent activity. Learn more about session insights and the chronicle command.
Use the Agent Debug Logs to understand what is consuming credits in a session.
Reviewing these logs helps you identify sessions or workflows that consume more tokens than expected, so you can adjust your approach.