智趣AI甄选

    Large model evaluation

    7 sites in total
    SuperCLUE

    A comprehensive evaluation benchmark for Chinese large models. It uses a multi-dimensional, multi-perspective evaluation system to measure general model capability, with the aim of supporting technical progress and industrial adoption.
    Tags: Large model evaluation, Large Model Review
    OpenCompass

    An open-source evaluation system for large models. It quantitatively assesses capabilities such as knowledge, language, understanding, and reasoning, with the aim of guiding iterative model improvement.
    Tags: Large model evaluation, Large Model Review
    HELM

    An evaluation benchmark initiated by Stanford University. It assesses large language models across multiple dimensions and scenarios, with the aim of driving technical progress and model optimization.
    Tags: Large model evaluation, Large Model Review
    MMBench

    A benchmark for multimodal models. It assesses model performance across different scenarios, using a carefully designed evaluation process and annotated datasets to produce robust, reliable results.
    Tags: Large model evaluation, Multimodal Evaluation, Test Framework
    AGI-Eval Evaluation Community (AGI-Eval评测社区)

    A comprehensive evaluation platform focused on the general ability of large models in human cognition and problem-solving tasks. It was jointly created by several universities and organizations and offers a range of evaluation methods and leaderboards.
    Tags: Large model evaluation, Large Model Review
    C-Eval

    A Chinese evaluation suite for foundation models, jointly launched by Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh. It consists of objective questions spanning multiple domains and difficulty levels and aims to measure large models' Chinese comprehension and reasoning.
    Tags: Large model evaluation, Model Evaluation
    FlagEval

    An evaluation system and open platform for large models that aims to be comprehensive, scientific, and fair. It provides multi-dimensional evaluation tools and methods to help researchers assess the performance of foundation models and training algorithms.
    Tags: Large model evaluation, Large Model Review

    智趣AI甄选
    Explore the forefront of AI with AIFun Selection. The site covers industry outlook, curated domestic and overseas AI products and applications, learning resources, and industry integration case studies that illustrate current trends.

    Copyright © 2025 AIFun Selection 津ICP备20002714号 