add pc commandline mls and full multimodal android app #3167

Open: wants to merge 13 commits into base: master
9 changes: 9 additions & 0 deletions README.md
@@ -4,6 +4,15 @@

[MNN Homepage](http://www.mnn.zone)

## News 🔥
- [2025/01/23] We released our full multimodal LLM Android App: [MNN-LLM-Android](./project/android/apps/MnnLlmApp/README.md), covering text-to-text, image-to-text, audio-to-text, and text-to-image generation.
<p align="center">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_home.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_diffusion.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_sound.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_image.jpg" style="margin: 0 10px;">
</p>

## Intro
MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. At present, MNN has been integrated into more than 30 apps of Alibaba Inc, such as Taobao, Tmall, Youku, DingTalk, Xianyu, etc., covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product searching by image, interactive marketing, equity distribution, security risk control. In addition, MNN is also used on embedded devices, such as IoT.

89 changes: 89 additions & 0 deletions project/android/apps/MnnLlmApp/README.md
@@ -0,0 +1,89 @@
# MNN-LLM Android App
[Chinese Version](./README_CN.md)
## Introduction
This is our full multimodal large language model (LLM) Android app.

<p align="center">
<img width="20%" alt="Icon" src="./assets/image_home.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./assets/image_diffusion.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./assets/image_sound.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./assets/image_image.jpg" style="margin: 0 10px;">
</p>


### Features

+ **Multimodal Support:** Enables functionality across diverse tasks, including text-to-text, image-to-text, audio-to-text, and text-to-image generation (via diffusion models).

+ **CPU Inference Optimization:** MNN-LLM demonstrates exceptional performance in CPU benchmarks on Android, achieving prefill speedups of 8.6x over llama.cpp and 20.5x over fastllm, with decoding speeds 2.3x and 8.9x faster, respectively. The following compares llama.cpp and MNN-LLM running Qwen-7B inference on Android.
<p align="center">
<img width="60%" src="./assets/compare.gif" style="margin: 0 10px;">
</p>

+ **Broad Model Compatibility:** Supports multiple leading model providers, such as Qwen, Gemma, Llama (including TinyLlama and MobileLLM), Baichuan, Yi, DeepSeek, InternLM, Phi, ReaderLM, and Smolm.

+ **Privacy First:** Runs entirely on-device, ensuring complete data privacy with no information uploaded to external servers.


# How to Use
+ You can download the app from [Releases](#releases) or [build it yourself](#development).
+ After installing the application, you can browse all supported models, download them, and interact with them directly within the app.
+ Additionally, you can access your chat history in the sidebar and revisit previous conversations seamlessly.

!!!warning!!! This version has been tested only on the OnePlus 13 and Xiaomi 14 Ultra; its stability on other devices cannot be guaranteed. Because of the demanding performance requirements of large language models (LLMs), many budget or low-spec devices may experience slow inference, application instability, or outright failure to run. If you encounter any issues, please feel free to open an issue for assistance.


# Development
+ Clone the repository:
```shell
git clone https://github.com/alibaba/MNN.git
```
+ Build library:
```shell
cd project/android
mkdir build_64
../build_64.sh "-DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_USE_LOGCAT=true -DMNN_OPENCL=true -DLLM_SUPPORT_VISION=true -DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true -DLLM_SUPPORT_AUDIO=true -DMNN_BUILD_AUDIO=true -DMNN_BUILD_DIFFUSION=ON -DMNN_SEP_BUILD=ON"
```
+ Copy the built libraries into the LLM Android app project:
```shell
find . -name "*.so" -exec cp {} ../apps/MnnLlmApp/app/src/main/jniLibs/arm64-v8a/ \;
```
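The `find ... -exec cp {} DEST \;` step above copies every shared library produced by the build into the app's `jniLibs` directory, while leaving everything else behind. As a hedged, self-contained illustration of that pattern (using throwaway temp directories and placeholder file names rather than the real `build_64`/`jniLibs` paths):

```shell
# Sandbox demo of the `find ... -exec cp {} DEST \;` pattern used above.
# The file names here are placeholders, not the actual build outputs.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/libMNN.so" "$src/libexample.so" "$src/build.log"

# Copy only the shared libraries, exactly as in the step above:
find "$src" -name "*.so" -exec cp {} "$dst"/ \;

ls "$dst"   # only the .so files appear; build.log is left behind
rm -rf "$src" "$dst"
```

Running `find` from the build directory rather than `cp *.so` also picks up libraries in nested subdirectories, which is why the README uses it here.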
+ Build the Android app project and install it:
```shell
cd ../apps/MnnLlmApp/
./gradlew installDebug
```

# Releases
## Version 0.1
+ Click here to [download](https://meta.alicdn.com/data/mnn/mnn_llm_app_debug_0_1.apk)
+ This is our first publicly released version. You can:
  + search all supported models, and download and chat with them in the app;
  + use the diffusion model:
    + stable-diffusion-v1-5
  + use the audio model:
    + qwen2-audio-7b
  + use the visual models:
    + qwen-vl-chat
    + qwen2-vl-2b
    + qwen2-vl-7b




# About MNN-LLM
MNN-LLM is a versatile inference framework designed to optimize and accelerate the deployment of large language models on both mobile devices and local PCs, addressing challenges such as high memory consumption and computational cost through innovations including model quantization, hybrid storage, and hardware-specific optimizations. In CPU benchmarks, MNN-LLM excels, achieving prefill speedups of 8.6x over llama.cpp and 20.5x over fastllm, complemented by decoding speeds 2.3x and 8.9x faster, respectively. In GPU-based assessments, MNN-LLM's performance declines slightly relative to MLC-LLM, particularly when using Qwen2-7B with shorter prompts, due to MLC-LLM's advantageous symmetric quantization technique; it still achieves up to 25.3x faster prefill and 7.1x faster decoding than llama.cpp, and 2.8x and 1.7x improvements over MLC-LLM, respectively.
For more detailed information, please refer to the paper: [MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices](https://dl.acm.org/doi/pdf/10.1145/3700410.3702126)


# Acknowledgements
This project is built upon the following open-source projects:

+ [progress-dialog](https://github.com/techinessoverloaded/progress-dialog)
+ [okhttp](https://github.com/square/okhttp)
+ [retrofit](https://github.com/square/retrofit)
+ [Android-SpinKit](https://github.com/ybq/Android-SpinKit)
+ [expandable-fab](https://github.com/nambicompany/expandable-fab)
+ [Android-Wave-Recorder](https://github.com/squti/Android-Wave-Recorder)
86 changes: 86 additions & 0 deletions project/android/apps/MnnLlmApp/README_CN.md
@@ -0,0 +1,86 @@
# MNN LLM Android App

## Introduction
This is our full multimodal language model (LLM) Android app.

<p align="center">
<img width="20%" alt="Icon" src="./assets/image_home.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./assets/image_diffusion.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./assets/image_sound.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./assets/image_image.jpg" style="margin: 0 10px;">
</p>


### Feature Highlights

+ **Multimodal Support:** Offers a range of tasks, including text-to-text, image-to-text, audio-to-text, and text-to-image generation (via diffusion models).

+ **CPU Inference Optimization:** On Android, MNN-LLM delivers outstanding CPU performance, with prefill speeds 8.6x faster than llama.cpp and 20.5x faster than fastllm, and decoding speeds 2.3x and 8.9x faster, respectively. The figure below compares llama.cpp and MNN-LLM.
<p align="center">
<img width="60%" src="./assets/compare.gif" style="margin: 0 10px;">
</p>


+ **Broad Model Compatibility:** Supports many leading model providers, including Qwen, Gemma, Llama (covering TinyLlama and MobileLLM), Baichuan, Yi, DeepSeek, InternLM, Phi, ReaderLM, and Smolm.

+ **Runs Locally:** Runs entirely on-device, ensuring data privacy; no information is uploaded to external servers.


# Usage
+ You can download the app from [Releases](#releases) or [build it yourself](#development).
+ After installing the app, you can browse all supported models, download the ones you need, and interact with them directly in the app.
+ In addition, you can access your chat history from the sidebar and easily review and manage previous conversations.

!!!warning!!! This version has so far been tested only on the OnePlus 13 and Xiaomi 14 Ultra. Because large language models (LLMs) place heavy demands on device performance, many low-spec devices may encounter slow inference, application instability, or outright failure to run; stability on other devices cannot be guaranteed. If you run into problems while using the app, please feel free to open an issue for help.


# Development
+ Clone the repository:
```shell
git clone https://github.com/alibaba/MNN.git
```
+ Build the library:
```shell
cd project/android
mkdir build_64
../build_64.sh "-DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_USE_LOGCAT=true -DMNN_OPENCL=true -DLLM_SUPPORT_VISION=true -DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true -DLLM_SUPPORT_AUDIO=true -DMNN_BUILD_AUDIO=true -DMNN_BUILD_DIFFUSION=ON -DMNN_SEP_BUILD=ON"
```
+ Copy the built libraries into the LLM Android app project:
```shell
find . -name "*.so" -exec cp {} ../apps/MnnLlmApp/app/src/main/jniLibs/arm64-v8a/ \;
```
+ Build the Android app project and install it:
```shell
cd ../apps/MnnLlmApp/
./gradlew installDebug
```

# Releases
## Version 0.1
+ Click here to [download](https://meta.alicdn.com/data/mnn/mnn_llm_app_debug_0_1.apk)
+ This is our first publicly released version. You can:
  + search all supported models, and download and chat with them in the app;
  + use the diffusion model:
    + stable-diffusion-v1-5
  + use the audio model:
    + qwen2-audio-7b
  + use the visual models:
    + qwen-vl-chat
    + qwen2-vl-2b
    + qwen2-vl-7b




# About MNN-LLM
MNN-LLM is a versatile inference framework designed to optimize and accelerate the deployment of large language models on mobile devices and local PCs. Through innovations such as model quantization, hybrid storage, and hardware-specific optimizations, it addresses challenges like high memory consumption and computational cost. In CPU benchmarks, MNN-LLM performs exceptionally well: its prefill speed is 8.6x faster than llama.cpp and 20.5x faster than fastllm, while decoding is 2.3x and 8.9x faster, respectively. In GPU-based evaluations, MNN-LLM's performance declines slightly on shorter prompts with Qwen2-7B due to MLC-LLM's advantageous symmetric quantization technique; it still achieves prefill 25.3x faster and decoding 7.1x faster than llama.cpp, and 2.8x and 1.7x improvements over MLC-LLM, respectively. For more detailed information, please refer to the paper: [MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices](https://dl.acm.org/doi/pdf/10.1145/3700410.3702126)


# Acknowledgements
This project is built upon the following open-source projects:
+ [progress-dialog](https://github.com/techinessoverloaded/progress-dialog)
+ [okhttp](https://github.com/square/okhttp)
+ [retrofit](https://github.com/square/retrofit)
+ [Android-SpinKit](https://github.com/ybq/Android-SpinKit)
+ [expandable-fab](https://github.com/nambicompany/expandable-fab)
+ [Android-Wave-Recorder](https://github.com/squti/Android-Wave-Recorder)
1 change: 1 addition & 0 deletions project/android/apps/MnnLlmApp/app/.gitignore
@@ -0,0 +1 @@
/build
85 changes: 85 additions & 0 deletions project/android/apps/MnnLlmApp/app/build.gradle
@@ -0,0 +1,85 @@
plugins {
id 'com.android.application'
id 'org.jetbrains.kotlin.android'
}

android {
namespace 'com.alibaba.mnnllm.android'
compileSdk 34

defaultConfig {
applicationId "com.alibaba.mnnllm.android"
minSdk 26
targetSdk 34
versionCode 1
versionName "0.1"

testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
externalNativeBuild {
cmake {
cppFlags '-std=c++17'
}
}

ndk {
//noinspection ChromeOsAbiSupport
abiFilters "arm64-v8a" // Include only arm64-v8a
}
}

// buildTypes, compileOptions, and kotlinOptions are android-level blocks;
// nesting them inside defaultConfig is invalid and breaks the Gradle sync.
buildTypes {
release {
minifyEnabled false
proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
}
}
compileOptions {
sourceCompatibility JavaVersion.VERSION_1_8
targetCompatibility JavaVersion.VERSION_1_8
}
kotlinOptions {
jvmTarget = "1.8"
}
externalNativeBuild {
cmake {
path file('src/main/cpp/CMakeLists.txt')
version '3.22.1'
}
}
buildFeatures {
viewBinding true
}
}

dependencies {
// implementation 'uk.co.chrisjenx:calligraphy:2.2.0'
implementation "org.jetbrains.kotlin:kotlin-stdlib-jdk8"
implementation 'androidx.appcompat:appcompat:1.7.0'
implementation 'com.google.android.material:material:1.12.0'
implementation 'androidx.constraintlayout:constraintlayout:2.2.0'
implementation 'com.github.techinessoverloaded:progress-dialog:1.5.1'
implementation 'com.squareup.okhttp3:okhttp:4.12.0'
implementation 'com.squareup.retrofit2:converter-gson:2.9.0'
implementation 'com.squareup.retrofit2:retrofit:2.9.0'
implementation 'com.squareup.okhttp3:logging-interceptor:4.9.3'
implementation 'com.squareup.retrofit2:converter-scalars:2.9.0'
implementation 'com.github.ybq:Android-SpinKit:1.4.0'
implementation 'com.nambimobile.widgets:expandable-fab:1.2.1'
implementation 'com.github.squti:Android-Wave-Recorder:2.0.1'
testImplementation 'junit:junit:4.13.2'
androidTestImplementation 'androidx.test.ext:junit:1.2.1'
androidTestImplementation 'androidx.test.espresso:espresso-core:3.6.1'
}
21 changes: 21 additions & 0 deletions project/android/apps/MnnLlmApp/app/proguard-rules.pro
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.
#
# For more details, see
# http://developer.android.com/guide/developing/tools/proguard.html

# If your project uses WebView with JS, uncomment the following
# and specify the fully qualified class name to the JavaScript interface
# class:
#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
# public *;
#}

# Uncomment this to preserve the line number information for
# debugging stack traces.
#-keepattributes SourceFile,LineNumberTable

# If you keep the line number information, uncomment this to
# hide the original source file name.
#-renamesourcefileattribute SourceFile
@@ -0,0 +1,26 @@
package com.alibaba.mnnllm.android.demo;

import android.content.Context;

import androidx.test.platform.app.InstrumentationRegistry;
import androidx.test.ext.junit.runners.AndroidJUnit4;

import org.junit.Test;
import org.junit.runner.RunWith;

import static org.junit.Assert.*;

/**
* Instrumented test, which will execute on an Android device.
*
* @see <a href="http://d.android.com/tools/testing">Testing documentation</a>
*/
@RunWith(AndroidJUnit4.class)
public class ExampleInstrumentedTest {
@Test
public void useAppContext() {
// Context of the app under test.
Context appContext = InstrumentationRegistry.getInstrumentation().getTargetContext();
// Must match the applicationId from build.gradle, not the test's own package.
assertEquals("com.alibaba.mnnllm.android", appContext.getPackageName());
}
}
57 changes: 57 additions & 0 deletions project/android/apps/MnnLlmApp/app/src/main/AndroidManifest.xml
@@ -0,0 +1,57 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools">
<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.FOREGROUND_SERVICE"/>
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_DATA_SYNC"/>
<uses-permission android:name="android.permission.POST_NOTIFICATIONS"/>
<uses-permission android:name="android.permission.READ_MEDIA_IMAGES"
tools:ignore="SelectedPhotoAccess" />
<uses-permission android:name="android.permission.READ_MEDIA_AUDIO"/>
<uses-permission android:name="android.permission.RECORD_AUDIO" />

<application
android:allowBackup="true"
android:icon="@drawable/ic_launcher"
android:label="@string/app_name"
android:roundIcon="@drawable/ic_launcher"
android:supportsRtl="true"
android:name="com.alibaba.mnnllm.android.DemoApplication"
android:theme="@style/Theme.MnnLlmApp"
tools:targetApi="31">
<uses-native-library android:name="libOpenCL.so"
android:required="true"/>
<provider
android:name="androidx.core.content.FileProvider"
android:authorities="${applicationId}.fileprovider"
android:exported="false"
android:grantUriPermissions="true">
<meta-data
android:name="android.support.FILE_PROVIDER_PATHS"
android:resource="@xml/file_paths" />
</provider>
<activity android:name="com.alibaba.mnnllm.android.chat.ChatActivity"
android:exported="true"
android:configChanges="orientation|screenSize"
>
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.DEFAULT" />
</intent-filter>
</activity>

<service android:name="com.alibaba.mls.api.download.DownlodForegroundService"
android:foregroundServiceType="dataSync"
/>

<activity android:name=".MainActivity"
android:exported="true"
android:configChanges="orientation|screenSize">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
</application>

</manifest>