
What is Paper2Any?
Paper2Any is an automated multimodal assistance platform developed by the DCAI research group at Peking University and built on the DataFlow-Agent framework. It targets the inefficiency of converting text into visual materials in research and office settings. Its core logic is to understand the logical structure of the user's input text, automatically generate structured visual elements (such as architecture diagrams, flowcharts, and data charts), and output directly editable PPT and SVG files. This design goes beyond traditional AI tools that produce only static images: users can drag and drop generated elements as modules, replace text, and adjust styles, which gives flexible WYSIWYG editing. Its core features are Paper2Figure (automated scientific illustration), Paper2PPT (long-document-to-presentation conversion), PDF2PPT (conversion of static documents into editable slides), and PPTPolish (PPT beautification and refinement), which together cover the entire content production workflow.
Paper2Any significantly lowers the design barrier, enabling researchers to efficiently create paper illustrations and presentation materials without learning complex software. Professionals can also swiftly transform lengthy documents into professional PowerPoint presentations, making it a powerful tool for boosting productivity.
Paper2Any's Key Features
- Paper2Figure: Automated Scientific Illustration
- Input support: PDF papers, text descriptions, sketches, screenshots, etc.
- Output capabilities:
- Automatically generates architecture diagrams and technology roadmaps, with bilingual annotations in Chinese and English;
- Extracts experimental data tables and converts them into comparative bar charts and line charts;
- Generates SVG and PPTX files whose elements can be edited independently (for example, adjusting line thickness or changing icon colors).
- Paper2PPT: Convert Structured Documents to Presentations
- Input support: upload PDF papers, paste long text, or enter a research topic.
- Output capabilities:
- Automatically parses the document structure and extracts background information, methodology, and key charts to generate an editable PowerPoint presentation;
- Supports a customizable number of slides, styles (academic/business), and Chinese/English language options;
- First to support ultra-long PPT generation (for reports of more than 40 pages), addressing the “font oddities” and “stiff expression” that appear when large models generate PowerPoint presentations.
- PDF2PPT: Bringing Static Documents to Life
- Technical principle: uses MinerU and the SAM model to parse PDF layouts with high precision and convert locked pages back into editable PPTX files.
- Core functionality:
- “Text removal with background retention” technique: restores the background in areas that text used to cover while preserving the original image's visual effect;
- Supports batch processing, ideal for quickly converting meeting handouts and research reports into presentation materials.
- PPTPolish: Enhancing and Refining PowerPoint Presentations
- Automated optimization:
- Adds tech-inspired backgrounds, visual icons, and logical diagrams;
- Adjusts the layout so that slides look more professional and less machine-made;
- Supports page-by-page prompt editing for fine-tuning and beautification.
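To illustrate what "independently editable" SVG elements mean in practice, here is a minimal sketch using only Python's standard library. The sample SVG, its `id` values, and the `restyle` helper are assumptions made for illustration; the actual structure of Paper2Any's output files may differ.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep the default namespace when serializing

# Hypothetical stand-in for a generated diagram; real output may be structured differently.
SAMPLE_SVG = f"""<svg xmlns="{SVG_NS}" width="200" height="80">
  <rect id="encoder" x="10" y="20" width="60" height="40" fill="#cccccc"/>
  <rect id="decoder" x="130" y="20" width="60" height="40" fill="#cccccc"/>
  <line id="arrow" x1="70" y1="40" x2="130" y2="40" stroke="#000000" stroke-width="1"/>
</svg>"""


def restyle(svg_text: str, element_id: str, **attrs: str) -> str:
    """Return svg_text with the given attributes set on the element whose id matches."""
    root = ET.fromstring(svg_text)
    for elem in root.iter():
        if elem.get("id") == element_id:
            for name, value in attrs.items():
                # Keyword arguments cannot contain hyphens, so "_" is mapped to "-".
                elem.set(name.replace("_", "-"), value)
            break
    else:
        raise KeyError(f"no element with id {element_id!r}")
    return ET.tostring(root, encoding="unicode")


# Thicken the connecting line and recolor one box; the other box is left untouched.
edited = restyle(SAMPLE_SVG, "arrow", stroke_width="3")
edited = restyle(edited, "encoder", fill="#1f77b4")
```

Because each shape is its own element, a change to one (here the arrow and the encoder box) leaves the rest of the drawing as it was, which is not possible with a flat raster image.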
Paper2Any Use Cases
- Illustrations for Research Papers and Presentations
- Pain point: drawing architecture diagrams requires repeated adjustment of lines, and visualizing experimental data means creating charts by hand.
- Solution: upload the paper PDF to automatically generate topic-aligned architecture diagrams and technical roadmaps; extract experimental data tables to create comparison charts and insert them into the PPT; use PPTPolish to add an academic-style background and finish the presentation materials with one click.
- Converting Long Workplace Documents into Presentations
- Pain point: converting a 20-page product white paper into a PowerPoint presentation takes half a day, and the formatting ends up inconsistent.
- Solution: upload the PDF white paper and select the “Business Style” template with a 15-slide setting; the system automatically extracts the core sections and data charts to generate a structured PowerPoint presentation; PPTPolish then unifies fonts and color schemes, delivering a professional presentation in about 10 minutes.
- Cross-language Academic Exchange
- Pain point: international conferences require both Chinese and English versions of a PowerPoint presentation, and manual translation and formatting are inefficient.
- Solution: enter the Chinese research topic to generate a draft Chinese PowerPoint presentation; switch to English mode within the system to automatically translate the text and adjust the formatting; export Chinese and English versions with consistent terminology.
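The "data table to comparison chart" step mentioned in the first use case can be pictured with a small sketch. It is not Paper2Any's implementation: the `bar_chart_svg` function, its parameters, and the sample values are all hypothetical, and the point is only to show a chart in which every bar remains a separate, editable SVG element.

```python
import xml.etree.ElementTree as ET


def bar_chart_svg(rows, bar_width=40, gap=20, height=120):
    """Render (label, value) rows as an SVG bar chart with one <rect> per row."""
    if not rows:
        raise ValueError("rows must not be empty")
    top = max(value for _, value in rows)
    if top <= 0:
        raise ValueError("at least one value must be positive")

    width = gap + len(rows) * (bar_width + gap)
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "width": str(width),
        "height": str(height + 20),  # extra room for the labels under the bars
    })
    for index, (label, value) in enumerate(rows):
        x = gap + index * (bar_width + gap)
        # Scale each bar relative to the largest value; negative values are clamped to 0.
        bar_height = round(max(value, 0) / top * height, 1)
        ET.SubElement(svg, "rect", {
            "id": f"bar-{index}",
            "x": str(x),
            "y": str(round(height - bar_height, 1)),
            "width": str(bar_width),
            "height": str(bar_height),
            "fill": "#4c72b0",
        })
        text = ET.SubElement(svg, "text", {
            "x": str(x), "y": str(height + 15), "font-size": "10",
        })
        text.text = label
    return ET.tostring(svg, encoding="unicode")


# Made-up placeholder values, not results from any paper.
chart = bar_chart_svg([("Method A", 40.0), ("Method B", 80.0)])
```

Writing `chart` to a `.svg` file and opening it in a vector editor would show two bars that can be selected and restyled one at a time.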
How to use Paper2Any?
- On-premises deployment (recommended for developers)
- Steps:
- Visit the GitHub repository (OpenDCAI/Paper2Any) and download the code;
- Launch the web frontend by following the instructions in the README; installation on Linux is supported;
- Upload local files (PDF/text/sketches) or directly enter your research topic.
- Quick Web Experience
- Steps:
- Visit the online platform (http://dcai-paper2any.nas.cpolar.cn/);
- Drag and drop files or paste text, then select the output format (PPT/SVG);
- View generation progress in real time and download editable files.
- Advanced Feature Customization
- Prompt optimization: when generating PPTs, adjust the style with prompts (e.g., “Add a tech-savvy feel,” “Simplify charts”).
- Modular editing: right-click a chart element in PowerPoint, select “Unlock,” then freely drag or replace it.
- Batch processing: upload multiple papers to generate architecture diagrams and data charts in bulk.
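One way to check for yourself that a downloaded deck is made of separate, editable objects rather than flattened images is to look inside the file. A `.pptx` file is a ZIP package whose slides are XML parts under `ppt/slides/`. The sketch below counts shape (`p:sp`) and picture (`p:pic`) elements per slide using only the standard library. The `count_slide_objects` helper and the in-memory demo package are assumptions for illustration, and the demo package is a stripped-down stand-in, not a valid presentation.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# PresentationML namespace used by slide XML parts in a .pptx package.
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def count_slide_objects(pptx_bytes: bytes) -> dict:
    """Count shapes (p:sp) and pictures (p:pic) in each slide part of a .pptx."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as package:
        for name in sorted(package.namelist()):
            if re.fullmatch(r"ppt/slides/slide\d+\.xml", name):
                root = ET.fromstring(package.read(name))
                counts[name] = {
                    "shapes": len(root.findall(f".//{{{P_NS}}}sp")),
                    "pictures": len(root.findall(f".//{{{P_NS}}}pic")),
                }
    return counts


def _demo_package() -> bytes:
    """Build a minimal stand-in package with one slide part (not a valid deck)."""
    slide = (
        f'<p:sld xmlns:p="{P_NS}"><p:cSld><p:spTree>'
        "<p:sp/><p:sp/><p:pic/>"
        "</p:spTree></p:cSld></p:sld>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("ppt/slides/slide1.xml", slide)
    return buffer.getvalue()


demo_counts = count_slide_objects(_demo_package())
```

For a real file, pass `pathlib.Path("deck.pptx").read_bytes()` instead of the demo package. A slide that reports many shapes and few pictures is more likely to be editable element by element.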
Recommended Reasons
- Technological leadership
- Full-path automation: from logical parsing to visual generation, it covers the entire content production workflow and reduces manual intervention;
- Multimodal interaction: supports multiple input types, including text, PDF, and sketches, and outputs editable PPT and SVG files, adapting flexibly to different scenarios;
- Academic friendliness: Chinese text reads naturally, formatting follows academic conventions, and the output avoids “AI-generated traces.”
- Revolutionary efficiency
- Reduced time costs: scientific illustration time drops from 2 hours to 5 minutes, and converting a long document to PowerPoint drops from half a day to 10 minutes.
- Design barriers eliminated: users can focus on the logic of their content rather than on formatting, without needing to learn Visio or Illustrator.
- Open ecosystem
- Open-source community support: the GitHub repository provides complete code and documentation, so developers can build on it.
- Continuous iteration plan: future updates will add features such as paper revision support (Paper2Rebuttal) and innovation point generation (Paper2Idea).
