We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples. By establishing a correlation between sample quality and image classification accuracy, we show that our best generative model also contains features competitive with top convolutional nets in the unsupervised setting.
Latest Briefs
Fast updates from the latest stories.
CYBERSECURITY
+1
Strengthening Cyber Defenses Against AI-Driven Vulnerabilities
Apr 16, 2026
LEADERSHIP
+2
CXOs Transform Work Models for Enhanced Performance in 2026
Apr 16, 2026
STARTUPS
+2
Daily Startup News Roundup: April 16, 2026
Apr 16, 2026
INVESTMENT
+3
Tonino Lamborghini Ventures into India's Luxury Homes Market with ₹4,000 Crore Project
Apr 16, 2026