HiDream-I1

HiDream-I1 Demo

HiDream-I1 is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.

For more features and to experience the full capabilities of our product, please visit https://vivago.ai/.

Project Updates

Models

We offer both the full version and distilled models. For more information about the models, please refer to the link under Usage.

| Name | Script | Inference Steps | HuggingFace repo |
| --- | --- | --- | --- |
| HiDream-I1-Full | inference.py | 50 | 🤗 HiDream-I1-Full |
| HiDream-I1-Dev | inference.py | 28 | 🤗 HiDream-I1-Dev |
| HiDream-I1-Fast | inference.py | 16 | 🤗 HiDream-I1-Fast |

Quick Start

Please make sure you have installed Flash Attention. We recommend CUDA version 12.4 for the manual installation.

pip install -r requirements.txt
pip install -U flash-attn --no-build-isolation
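To confirm the install worked before starting a long inference run, you can check that the package is importable. This is a minimal sketch, assuming the flash-attn package installs under the module name flash_attn; the helper name is_module_available is illustrative and not part of this repository.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a top-level module can be located (without importing it)."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: flash-attn is importable as "flash_attn".
    if is_module_available("flash_attn"):
        print("flash_attn found")
    else:
        print("flash_attn missing: pip install -U flash-attn --no-build-isolation")
```

The check only locates the module; it does not verify that the build matches your CUDA version.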

Then you can run the inference scripts to generate images:

# For full model inference
python ./inference.py --model_type full

# For distilled dev model inference
python ./inference.py --model_type dev

# For distilled fast model inference
python ./inference.py --model_type fast
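If you want to launch these commands from another Python program, a small wrapper can validate the model type first. The sketch below is illustrative only: build_command and INFERENCE_STEPS are hypothetical names, and the step counts are copied from the Models table for reference rather than passed to the script, since the only flag documented here is --model_type.

```python
import shlex
import sys

# Step counts from the Models table (for reference only; not passed to the script).
INFERENCE_STEPS = {"full": 50, "dev": 28, "fast": 16}


def build_command(model_type: str, script: str = "./inference.py") -> list:
    """Build the argv list for one of the three documented model types."""
    if model_type not in INFERENCE_STEPS:
        valid = ", ".join(sorted(INFERENCE_STEPS))
        raise ValueError(f"unknown model_type {model_type!r}; expected one of: {valid}")
    return [sys.executable, script, "--model_type", model_type]


if __name__ == "__main__":
    for name, steps in INFERENCE_STEPS.items():
        # Print each command instead of running it.
        print(f"{shlex.join(build_command(name))}  # {steps} inference steps")
```

To actually run a model, pass the result to subprocess.run(build_command("dev"), check=True) from the repository root.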

NOTE

The inference script will try to automatically download the meta-llama/Llama-3.1-8B-Instruct model files. To use the automatic downloader, you need to accept the Llama model license on your HuggingFace account and log in with huggingface-cli login.
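A pre-flight check can tell you whether a login token is present before the download starts. The sketch below uses only the standard library and rests on assumptions about where the Hugging Face tooling keeps credentials: the HF_TOKEN environment variable, or a token file under HF_HOME (defaulting to ~/.cache/huggingface). The function name find_hf_token is hypothetical, and other overrides such as a custom cache directory are ignored for simplicity.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return a stored Hugging Face token, or None if no login is detected."""
    env = os.environ if env is None else env

    # 1. A token supplied directly through the environment.
    token = env.get("HF_TOKEN")
    if token and token.strip():
        return token.strip()

    # 2. The token file assumed to be written by `huggingface-cli login`.
    default_home = Path(home or Path.home()) / ".cache" / "huggingface"
    hf_home = Path(env.get("HF_HOME") or default_home)
    token_file = hf_home / "token"
    if token_file.is_file():
        return token_file.read_text().strip() or None
    return None


if __name__ == "__main__":
    if find_hf_token() is None:
        print("No token found: run huggingface-cli login first")
    else:
        print("Token found")
```

A token being present does not prove that the Llama license has been accepted; that is only checked by the server at download time.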

Gradio Demo

We also provide a Gradio demo for interactive image generation. You can run the demo with:

python gradio_demo.py

Evaluation Metrics

DPG-Bench

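| Model | Overall | Global | Entity | Attribute | Relation | Other |
| --- | --- | --- | --- | --- | --- | --- |
| PixArt-alpha | 71.11 | 74.97 | 79.32 | 78.60 | 82.57 | 76.96 |
| SDXL | 74.65 | 83.27 | 82.43 | 80.91 | 86.76 | 80.41 |
| DALL-E 3 | 83.50 | 90.97 | 89.61 | 88.39 | 90.58 | 89.83 |
| Flux.1-dev | 83.79 | 85.80 | 86.79 | 89.98 | 90.04 | 89.90 |
| SD3-Medium | 84.08 | 87.90 | 91.01 | 88.83 | 80.70 | 88.68 |
| Janus-Pro-7B | 84.19 | 86.90 | 88.90 | 89.40 | 89.32 | 89.48 |
| CogView4-6B | 85.13 | 83.85 | 90.35 | 91.17 | 91.14 | 87.29 |
| HiDream-I1 | 85.89 | 76.44 | 90.22 | 89.48 | 93.74 | 91.83 |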

GenEval

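| Model | Overall | Single Obj. | Two Obj. | Counting | Colors | Position | Color attribution |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SDXL | 0.55 | 0.98 | 0.74 | 0.39 | 0.85 | 0.15 | 0.23 |
| PixArt-alpha | 0.48 | 0.98 | 0.50 | 0.44 | 0.80 | 0.08 | 0.07 |
| Flux.1-dev | 0.66 | 0.98 | 0.79 | 0.73 | 0.77 | 0.22 | 0.45 |
| DALL-E 3 | 0.67 | 0.96 | 0.87 | 0.47 | 0.83 | 0.43 | 0.45 |
| CogView4-6B | 0.73 | 0.99 | 0.86 | 0.66 | 0.79 | 0.48 | 0.58 |
| SD3-Medium | 0.74 | 0.99 | 0.94 | 0.72 | 0.89 | 0.33 | 0.60 |
| Janus-Pro-7B | 0.80 | 0.99 | 0.89 | 0.59 | 0.90 | 0.79 | 0.66 |
| HiDream-I1 | 0.83 | 1.00 | 0.98 | 0.79 | 0.91 | 0.60 | 0.72 |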

HPSv2.1 benchmark

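| Model | Averaged | Animation | Concept-art | Painting | Photo |
| --- | --- | --- | --- | --- | --- |
| Stable Diffusion v2.0 | 26.38 | 27.09 | 26.02 | 25.68 | 26.73 |
| Midjourney V6 | 30.29 | 32.02 | 30.29 | 29.74 | 29.10 |
| SDXL | 30.64 | 32.84 | 31.36 | 30.86 | 27.48 |
| DALL-E 3 | 31.44 | 32.39 | 31.09 | 31.18 | 31.09 |
| SD3 | 31.53 | 32.60 | 31.82 | 32.06 | 29.62 |
| Midjourney V5 | 32.33 | 34.05 | 32.47 | 32.24 | 30.56 |
| CogView4-6B | 32.31 | 33.23 | 32.60 | 32.89 | 30.52 |
| Flux.1-dev | 32.47 | 33.87 | 32.27 | 32.62 | 31.11 |
| stable cascade | 32.95 | 34.58 | 33.13 | 33.29 | 30.78 |
| HiDream-I1 | 33.82 | 35.05 | 33.74 | 33.88 | 32.61 |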

License

The code in this repository and the HiDream-I1 models are licensed under the MIT License.