
Use of Dense and MoE AI Architectures in LLMs


Over the past decade we have seen a flood of development and innovation in new AI models across the world. With cutting-edge Transformer-based language models being just the tip of the iceberg, AI training and deployment have been revolutionised by complex but clever algorithms that scale heights once only dreamt of in stories older than the industrial revolution. In today's rapidly evolving era, two architectures have proven their might above the crowd: the Dense architecture and the MoE architecture, each among the sharpest blades of its batch.

The dense architecture is the traditional and more popular path for most AI pioneers, powering models from OpenAI, Google (Gemini), Perplexity and many more. Though the underlying ideas are more than a decade old, the approach was made famous by the release of ChatGPT in 2022, which attracted millions of users within just a few days of release. Language models built on this architecture are heavy in size, consisting of billions if not trillions of parameters and requiring expensive compute and billions of dollars in hardware that is not easily obtained. Training these models also consumes huge amounts of local resources, from electricity to the water used to cool the hardware, and the resulting waste contributes to heavy environmental pollution.

Though computationally heavy, dense models are the go-to option for many AI enthusiasts building software that reads human language, interprets it and produces a result. These models understand text well, thanks to the transformer architecture underneath them, which can outdo even skilled humans on some language tasks. Dense models are also easier to train: every parameter is used for every input, so the training pipeline is straightforward, and even a first-year computer science student can train a small dense model. The layering is simple enough to be understood theoretically by high schoolers and practically by undergraduates, as the sketch below shows.
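To make the "simple layering" concrete, here is a minimal sketch of one dense transformer block, assuming PyTorch. The sizes (d_model, n_heads, d_ff) are illustrative only; the key point is that every parameter runs for every token, which is exactly what makes the architecture "dense".

```python
import torch
import torch.nn as nn

class DenseTransformerBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Feed-forward network: in a dense model this full matrix
        # multiplication is computed for every single token.
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Self-attention with a residual connection.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Dense feed-forward with a residual connection.
        x = self.norm2(x + self.ff(x))
        return x

# Usage: a batch of 2 sequences, 16 tokens each, 512-dimensional embeddings.
block = DenseTransformerBlock()
tokens = torch.randn(2, 16, 512)
print(block(tokens).shape)  # torch.Size([2, 16, 512])
```

A full dense LLM is essentially dozens of these blocks stacked on top of each other, which is why the parameter count, and the hardware bill, grows so quickly.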

The prodigy of LLM architecture is MoE, which stands for Mixture of Experts: a more efficient and clever design that saves heavy compute and has become the limelight of attention over the past year, a case of smartness beating raw strength. Though the idea had been explored for years without taking over the mainstream, it was reborn on January 20th, 2025, when DeepSeek released its 600+ billion parameter R1 model, an event that wiped billions of dollars off the market value of American technology companies, most visibly Nvidia.

Rather than activating the whole model for every input, a mixture-of-experts model is divided into many smaller sub-networks known as "Experts", each of which loosely specialises in a particular kind of content, such as mathematics, language, or code. A "Gate", often called a router, sits in front of the experts: it classifies each part of the user's query and picks only the few experts needed to handle it, so only those experts run instead of the whole model, saving compute and money. Great examples of mixture-of-experts LLMs are xAI's Grok, Mistral's Mixtral, and the most popular, DeepSeek, which sank the market value of many companies on the day of its release. Though controversial for answers that echo Chinese government positions, DeepSeek achieved a great milestone by training a huge LLM on limited hardware: because of tensions between the American and Chinese governments, export controls kept the best American GPUs out of DeepSeek's reach, and the team was forced to think smart.
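The router-plus-experts idea can be shown in a short sketch, again assuming PyTorch. This is a simplified top-2 router with softmax gating; the expert count, sizes, and routing scheme are illustrative assumptions, and real MoE models such as DeepSeek or Mixtral use far more elaborate routing and load balancing.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The "Gate"/router: a small linear layer that scores every expert
        # for each token.
        self.router = nn.Linear(d_model, n_experts)
        # The "Experts": independent feed-forward sub-networks.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # blend the chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token -- the compute
        # saving the article describes.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# Usage: 10 tokens, each routed to 2 of the 8 experts.
layer = MoELayer()
tokens = torch.randn(10, 512)
print(layer(tokens).shape)  # torch.Size([10, 512])
```

The saving comes from the loop body: each token touches only its top-2 experts, so most of the model's parameters stay idle for any given query even though the total parameter count is huge.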

Both architectures have shaped the foundation of AI, proving their might by making non-living matter, i.e. machinery, appear to think, which is a great feat in itself. Even fifty years ago, no one imagined that the silicon in the ordinary rocks they found every day would one day power machines that converse like humans. "AI has not replaced us and never will; AI is the chariot that will let us get closer to the human brain itself." With more architectures yet to be discovered, the human mind will most certainly foster greater inventions that amaze even the most brilliant minds. We shouldn't be afraid of AI, because it cannot replace the curious nature of humans, which only grows as more things are discovered.


Published on 10/28/2025


Hardik Sharma Phuyal is a student at Deerwalk Sifal School who loves writing articles, exploring diverse topics, and engaging in creative discussions.

Hardik Sharma Phuyal

Grade 9

Roll No: 29047
