Artificial intelligence (AI) firm OpenAI has reportedly begun asking third-party contractors to submit actual work from their current or past jobs so the company can test and benchmark the performance of its next-generation AI models.
Key Points
- OpenAI is collecting real human work from freelancers to benchmark AI performance and train next-generation models.
- The initiative involves contractors submitting detailed task requests and deliverables, including actual files or realistic mock-ups.
- The move spotlights a contradiction: jobs considered “replaceable” or not “real work” are being used as critical benchmarks for AI development.
Records obtained by Wired from OpenAI and the training data firm Handshake AI suggest the initiative is part of OpenAI’s effort to create a human performance benchmark for various tasks. In September, OpenAI introduced a new evaluation system designed to compare its AI models’ output with that of human professionals across multiple industries.
The AI company describes this evaluation system as a crucial measure of its progress toward developing Artificial General Intelligence (AGI), an AI capable of outperforming humans in most economically valuable tasks.
A confidential OpenAI document states that the company engaged third-party contractors from a range of professions to gather real-world tasks based on work typically performed in full-time roles, converting long-term or complex projects, which often require hours or days to complete, into benchmark tasks.
Additionally, the AI firm instructed contractors to detail tasks they have completed in their current or previous roles. Along with these descriptions, contractors were asked to provide “concrete output (not a summary of the file, but the actual file), e.g., Word doc, PDF, Powerpoint, Excel, image, repo.” In cases where real examples were unavailable, contractors were permitted to submit fabricated samples that realistically demonstrate how they would approach specific tasks.
According to OpenAI records, real-world tasks consist of two parts: the task request, which outlines what a manager or colleague asked the worker to do, and the task deliverable, which is the actual work completed in response. The company repeatedly stresses in its instructions that contractors’ submissions should represent genuine, on-the-job work that they have “actually done.”
OpenAI’s reliance on real human work spotlights a stark contradiction in the AI landscape. In 2025, companies across industries have cited AI as a reason for mass layoffs, cutting roles deemed vulnerable to automation. Yet OpenAI CEO Sam Altman has previously suggested that many of these same positions might not even qualify as “real work.”
The company is now asking contractors to submit precisely these types of tasks to train and benchmark AI agents, turning the very labor labeled expendable into essential benchmarks for next-generation AI. The move raises urgent questions about whose work is valued, and who pays the price for automation.
