Embedl's Blog on Deep Learning
Lightning-Fast Multimodal Edge Inference with Under 8GB RAM
Running advanced multimodal reasoning models on edge hardware has traditionally required large GPUs and tens of gigabyt...
Cosmos Reason 2 Without the Quantization Trade-Off
We have just released embedl/Cosmos-Reason2-2B-W4A16-Edge2, a new mixed-precision variant of Cosmos Reason 2 that recove...
Blackwell-optimized Cosmos Reason 2
Today, we are releasing embedl/Cosmos-Reason2-2B-NVFP4A16, a new Blackwell-optimized variant of Cosmos Reason 2. This mo...
Cosmos Reason 2, Quantized for the Edge
Today we’re releasing the first quantized version of Cosmos Reason 2, which runs efficiently on the Jetson Nano Super: e...
The cost of running frontier AI models
Research groups pushing the limits of artificial intelligence are running into a new kind of barrier. The economic cost ...
Ultra-Efficient SLMs: Embedl’s Breakthrough for On-Device AI
Embedl has reached a major milestone for efficient LLMs: introducing FlashHead, a training-free, hardware-friendly drop...
Intelligence per Watt: Edge versus Cloud
A new paper [1] from a group at Stanford introduces new metrics to evaluate the energy efficiency of AI models: intelligenc...
EDGE AI Talks: Faster Time-To-Device with Embedl Hub
Watch EDGE AI Talks: Faster Time-To-Device with Embedl Hub, featuring our Product Owner Andreas Ask. Edge AI is redefining ho...