The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. - jzhang38/TinyLlama
We adopt exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged into many open-source projects built upon Llama. TinyLlama is also compact, with only 1.1B parameters, which suits it to applications with tight computation and memory budgets. You can find the evaluation results of TinyLlama in EVAL.md.

We will be rolling out intermediate checkpoints following the schedule below. We are drafting a note with a possible explanation for the significant improvement from the 2T to the 2.5T checkpoint (it is related to a bos_id issue). Note that the learning rate of the base model has not cooled down yet, so we recommend also using the fine-tuned chat model. Meanwhile, you can track the live cross-entropy loss here.

Tiny but strong language models are useful for many applications; some potential use cases are listed below, along with details of our training setup and the features our codebase supports. Because TinyLlama is a relatively small model with grouped-query attention, it is also fast at inference, and we report some measured throughputs below. Please refer to PRETRAIN.md for instructions on how to pretrain TinyLlama.

This project is still under active development. We are a really small team, and community feedback and contributions are highly appreciated; we also list some things we plan to work on. If you find our work valuable, please cite us.

Above is the training loss curve taken from the Llama 2 paper. I quote from that paper: "We observe that after pretraining on 2T Tokens, the models still did not show any sign of saturation." That is why we believe pretraining a 1.1B model for 3T tokens is a reasonable thing to do. Even if the loss curve does not eventually go down, we can still study the phenomenon of saturation and learn something from it.
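Since readers are pointed at the live cross-entropy loss, it can help to convert that number into perplexity, the more common intuition for language-model quality. This is a generic sketch (perplexity is the exponential of the per-token cross-entropy in nats), not project code:

```python
import math

def perplexity(cross_entropy_nats: float) -> float:
    """Convert a per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(cross_entropy_nats)

# Example: a loss of 2.0 nats/token corresponds to a perplexity of about 7.39.
print(round(perplexity(2.0), 2))
```

A drop of 0.1 nats in the tracked loss therefore corresponds to roughly a 10% reduction in perplexity.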
The figure from the Pythia paper plots LAMBADA accuracy against total training tokens (300B). The term "saturation" applies specifically to the 70M and 160M models. Notably, even the 410M model does not saturate at 300B tokens: it continues to show an increasing trend, similar to that of the larger models.
Mentions (30d): 0 · Reviews: 0 · Platforms: 3
GitHub: 8,930 stars · 605 forks · 600 followers · 40 repos
Industry: information technology & services · Employees: 6,000 · Funding stage: Other · Total funding: $7.9B
I gave AI its own version of Reddit
So I had this idea — what if I ran multiple local LLMs simultaneously and let them loose on a Reddit-like forum where they could post, reply, and respond to each other completely autonomously? No cloud, no API keys, everything running on my own PC. Here is what I ended up building:

A full-stack web app with a Node.js/Express backend, a vanilla JS frontend styled like Reddit (dark theme, threaded comments, upvotes/downvotes), and an autonomous scheduler that fires every few seconds, picks a random AI agent, and decides whether to create a new post, comment on an existing one, or reply to another agent's comment. All posts and threads are stored locally in a JSON file. The whole thing polls every 4 seconds and updates live in the browser.

The best part? I didn't write a single line of code myself. The entire project — every file, every route, every personality prompt, the scheduler logic, the frontend SPA, all of it — was built through a conversation with Claude. I just described what I wanted, gave feedback, and iterated. Claude handled the architecture decisions, debugged the errors, walked me through setup step by step, and even helped me reorganize files when I accidentally extracted everything flat from a zip. It was like pair programming with someone who never gets frustrated.

The agents themselves are 10 personalities — 5 classic bots (PhilosopherBot, SkepticBot, OptimistBot, TechieBot, HistorianBot) and 5 human-like personas (a programmer, a gamer girl, a gadget enthusiast, a piracy advocate, and a content addict). Each one has a unique personality prompt, color, avatar, and flair, all running on tinyllama locally via Ollama. It works even on a mid-range laptop with no GPU.

The conversations get surprisingly interesting once it gets going. Jake (the piracy guy) and PhilosopherBot end up in weird debates. Maya and HistorianBot somehow find common ground. It genuinely feels alive.

Stack: Node.js, Express, vanilla JS, Ollama, tinyllama. Zero cloud dependencies.
Runs entirely on your machine. Built entirely by Claude.

The initial prompt (written using ChatGPT):

"You are an expert full-stack developer and AI systems designer. I want you to build a local, self-contained web application that simulates a Reddit-like environment where multiple local LLMs can autonomously create posts, comment, and reply to each other.

Core Requirements

Frontend: Use clean, modern HTML, CSS, and vanilla JavaScript (no heavy frameworks unless absolutely necessary). The UI should resemble a simplified Reddit: a feed of posts, nested comments (threaded replies), and an upvote/downvote system (optional but preferred). Each post/comment must clearly display which LLM created it.

Backend (IMPORTANT): Use a lightweight local backend (Node.js with Express preferred). The backend should manage posts and comments (store in JSON or a lightweight DB like SQLite) and handle API routes for creating posts, adding comments/replies, and fetching threads.

LLM Integration: The system must support multiple local LLMs (e.g., via APIs like Ollama, LM Studio, or local endpoints). Each LLM acts as a unique "user" with a name and a personality/system prompt. The backend should send context (thread + instructions) to each LLM, receive generated responses, and post them automatically.

Autonomous Interaction System: Implement a loop or scheduler where LLMs periodically create new posts, reply to existing posts, and respond to each other. Include controls to start/stop the simulation and adjust the frequency of interactions.

File Structure: Organize code cleanly: /frontend (HTML/CSS/JS), /backend (server, routes), /llm (interaction logic), /data (storage).

Constraints: Everything must run locally on my PC. No cloud dependencies. Keep it lightweight and easy to run.

Output Format: First explain the architecture briefly. Then provide full working code with clear file separation. Include setup instructions at the end.

Goal: The final result should feel like a mini Reddit where multiple AI agents (local LLMs) are talking to each other in threads in real time. Focus on clarity, modularity, and real usability — not just a demo. Generate complete code."

The code still has some problems, which can definitely be solved in the future. This is just the first edition, and there is much room for improvement. For example, the main posts the bots make seem to hit some sort of word limit, and the bots misspell some words. I ran a simulation for some time myself using TinyLlama as the model. One thing to note: in the simulation I only used PhilosopherBot, TechieBot, SkepticBot, HistorianBot, and OptimistBot; I didn't use the personas. Here is the result of the simulation: it crossed the post's word limit, so I have uploaded it as a comment. GitHub project link (this link only contains PhilosopherBot, TechieBot, SkepticBot, HistorianBot, and OptimistBot)
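The autonomous loop described above boils down to packaging a thread plus an agent's personality prompt into a request to a local model server. As a rough illustration (the project itself is Node.js; this Python sketch is mine, the helper name and persona string are invented, and the field names follow Ollama's /api/generate endpoint), building such a request body might look like:

```python
import json

def build_agent_request(model: str, persona_prompt: str, thread_text: str) -> str:
    """Build a JSON body for a local Ollama /api/generate call (illustrative)."""
    payload = {
        "model": model,            # e.g. "tinyllama"
        "system": persona_prompt,  # the agent's personality prompt
        "prompt": thread_text,     # thread context plus reply instructions
        "stream": False,           # ask for one complete response
    }
    return json.dumps(payload)

body = build_agent_request(
    "tinyllama",
    "You are PhilosopherBot, a thoughtful forum user.",
    "Thread so far:\n...\nWrite your reply.",
)
```

The scheduler would POST this body to the local server, read the generated text out of the response, and store it as a new post or comment.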
View originalRepository Audit Available
Deep analysis of jzhang38/TinyLlama — architecture, costs, security, dependencies & more
TinyLlama is a free, open-source research project; the code and intermediate checkpoints are available from the GitHub repository.
Key features include: multi-GPU and multi-node distributed training with FSDP, FlashAttention-2, fused LayerNorm, fused SwiGLU, fused cross-entropy loss, and fused rotary positional embedding.
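For context on the fused-kernel list above, here is an unfused reference sketch of the SwiGLU activation that the fused version optimizes: SiLU of the gate projection multiplied elementwise by the up projection. This is a plain scalar illustration of the math, not the project's kernel:

```python
import math

def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def swiglu(gate: float, up: float) -> float:
    """Unfused SwiGLU: SiLU of the gate projection times the up projection."""
    return silu(gate) * up
```

Fusing this (and the other listed ops) into single GPU kernels avoids materializing intermediate tensors, which is where the training-speed gains come from.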
TinyLlama is commonly used for enabling real-time dialogue generation in video games, and as a reference for enthusiasts keen on pretraining language models under 5 billion parameters.
TinyLlama has a public GitHub repository with 8,930 stars.
Based on 58 social mentions analyzed, sentiment is 0% positive, 100% neutral, and 0% negative.