diff --git a/index.md b/index.md
index ce1bc28..8218949 100644
--- a/index.md
+++ b/index.md
@@ -191,9 +191,11 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [KwaiAgents](https://github.com/KwaiKEG/KwaiAgents) | A generalized information-seeking agent system with Large Language Models (LLMs). |[arXiv](https://arxiv.org/abs/2312.04889) | | Agent |
 | [LangChain](https://github.com/langchain-ai/langchain) | Get your LLM application from prototype to production. | | | Agent |
 | [Langflow](https://github.com/logspace-ai/langflow) | Langflow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows. | | | Agent |
+| [LangGraph Studio](https://github.com/langchain-ai/langgraph-studio) | LangGraph Studio offers a new way to develop LLM applications by providing a specialized agent IDE that enables visualization, interaction, and debugging of complex agentic applications. | | | Agent |
 | [LARP](https://github.com/MiAO-AI-Lab/LARP) | Language-Agent Role Play for open-world games. |[arXiv](https://arxiv.org/abs/2312.17653) | | Agent |
 | [LLama Agentic System](https://github.com/meta-llama/llama-agentic-system) | Agentic components of the Llama Stack APIs. | | | Agent |
 | [LlamaIndex](https://github.com/run-llama/llama_index) | LlamaIndex is a data framework for your LLM application. | | | Agent |
+| [MindSearch](https://github.com/InternLM/MindSearch) | 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT). | | | Agent |
 | [Mixture of Agents (MoA)](https://github.com/togethercomputer/MoA) | Mixture-of-Agents Enhances Large Language Model Capabilities. |[arXiv](https://arxiv.org/abs/2406.04692) | | Agent |
 | [Moonlander.ai](https://www.moonlander.ai/) | Start building 3D games without any coding using generative AI. | | | Framework |
 | [MuG Diffusion](https://github.com/Keytoyze/Mug-Diffusion) | MuG Diffusion is a charting AI for rhythm games based on Stable Diffusion (one of the most powerful AIGC models) with a large modification to incorporate audio waves. | | | Game |
@@ -266,6 +268,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [AutoStudio](https://github.com/donahowe/AutoStudio) | Crafting Consistent Subjects in Multi-turn Interactive Image Generation. |[arXiv](https://arxiv.org/abs/2406.01388) | | Image |
 | [Blender-ControlNet](https://github.com/coolzilj/Blender-ControlNet) | Using ControlNet right in Blender. | | Blender | Image |
 | [BriVL](https://github.com/BAAI-WuDao/BriVL) | Bridging Vision and Language Model. |[arXiv](https://arxiv.org/abs/2103.06561) | | Image |
+| [CatVTON](https://github.com/Zheng-Chong/CatVTON) | CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models. |[arXiv](https://arxiv.org/abs/2407.15886) | | Image |
 | [CLIPasso](https://github.com/yael-vinker/CLIPasso) | A method for converting an image of an object to a sketch, allowing for varying levels of abstraction. |[arXiv](https://arxiv.org/abs/2202.05822) | | Image |
 | [ClipDrop](https://clipdrop.co/) | Create stunning visuals in seconds. | | | Image |
 | [ComfyUI](https://github.com/comfyanonymous/ComfyUI) | A powerful and modular stable diffusion GUI with a graph/nodes interface. | | | Image |
@@ -283,6 +286,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [Draw Things](https://drawthings.ai/) | AI-assisted image generation in your pocket. | | | Image |
 | [DWPose](https://github.com/idea-research/dwpose) | Effective Whole-body Pose Estimation with Two-stages Distillation. |[arXiv](https://arxiv.org/abs/2307.15880) | | Image |
 | [EasyPhoto](https://github.com/aigc-apps/sd-webui-EasyPhoto) | Your Smart AI Photo Generator. | | | Image |
+| [Flux](https://github.com/black-forest-labs/flux) | This repo contains minimal inference code to run text-to-image and image-to-image with our Flux latent rectified flow transformers. | | | Image |
 | [Follow-Your-Click](https://github.com/mayuelala/FollowYourClick) | Open-domain Regional Image Animation via Short Prompts. |[arXiv](https://arxiv.org/abs/2403.08268) | | Image |
 | [Fooocus](https://github.com/lllyasviel/Fooocus) | Focus on prompting and generating. | | | Image |
 | [GIFfusion](https://github.com/DN6/giffusion) | Create GIFs and Videos using Stable Diffusion. | | | Image |
@@ -320,6 +324,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [RPG-DiffusionMaster](https://github.com/YangLing0818/RPG-DiffusionMaster) | Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG). | | | Image |
 | [SEED-Story](https://github.com/TencentARC/SEED-Story) | SEED-Story: Multimodal Long Story Generation with Large Language Model. |[arXiv](https://arxiv.org/abs/2407.08683) | | Image |
 | [Segment Anything](https://segment-anything.com/) | Segment Anything Model (SAM): a new AI model from Meta AI that can "cut out" any object, in any image, with a single click. |[arXiv](https://arxiv.org/abs/2304.02643) | | Image |
+| [Segment Anything Model 2 (SAM 2)](https://github.com/facebookresearch/segment-anything-2) | SAM 2: Segment Anything in Images and Videos. |[arXiv](https://arxiv.org/abs/2408.00714) | | Image |
 | [sd-webui-controlnet](https://github.com/Mikubill/sd-webui-controlnet) | WebUI extension for ControlNet. | | | Image |
 | [SDXL-Lightning](https://huggingface.co/ByteDance/SDXL-Lightning) | Progressive Adversarial Diffusion Distillation. |[arXiv](https://arxiv.org/abs/2402.13929) | | Image |
 | [SDXS](https://github.com/IDKiro/sdxs) | Real-Time One-Step Latent Diffusion Models with Image Conditions. | | | Image |
@@ -383,6 +388,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | Source | Description | Paper | Game Engine | Type |
 | :------------------------------------------------------------------------------------------ | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-----------: | :-----------: | :-------: |
 | [Anything-3D](https://github.com/Anything-of-anything/Anything-3D) | Segment-Anything + 3D. Let's lift anything to 3D. |[arXiv](https://arxiv.org/abs/2304.10261) | | Model |
+| [Any2Point](https://github.com/Ivan-Tang-3D/Any2Point) | Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding. |[arXiv](https://arxiv.org/abs/2404.07989) | | 3D |
 | [BlenderGPT](https://github.com/gd3kr/BlenderGPT) | Use commands in English to control Blender with OpenAI's GPT-4. | | Blender | Model |
 | [Blender-GPT](https://github.com/TREE-Ind/Blender-GPT) | An all-in-one Blender assistant powered by GPT3/4 + Whisper integration. | | Blender | Model |
 | [Blockade Labs](https://www.blockadelabs.com/) | Digital alchemy is real with Skybox Lab - the ultimate AI-powered solution for generating incredible 360° skybox experiences from text prompts. | | | Model |
@@ -401,6 +407,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [GaussianDreamer](https://github.com/hustvl/GaussianDreamer) | Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors. |[arXiv](https://arxiv.org/abs/2310.08529) | | 3D |
 | [GenieLabs](https://www.genielabs.tech/) | Empower your game with AI-UGC. | | | 3D |
 | [HiFA](https://hifa-team.github.io/HiFA-site/) | High-fidelity Text-to-3D with Advanced Diffusion Guidance. | | | Model |
+| [HoloDreamer](https://github.com/zhouhyOcean/HoloDreamer) | HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions. |[arXiv](https://arxiv.org/abs/2407.15187) | | 3D |
 | [Infinigen](https://github.com/princeton-vl/infinigen) | Infinite Photorealistic Worlds using Procedural Generation. |[arXiv](https://arxiv.org/abs/2306.09310) | | 3D |
 | [Instruct-NeRF2NeRF](https://instruct-nerf2nerf.github.io/) | Editing 3D Scenes with Instructions. |[arXiv](https://arxiv.org/abs/2303.12789) | | Model |
 | [Interactive3D](https://github.com/interactive-3d/interactive3d) | Create What You Want by Interactive 3D Generation. |[arXiv](https://arxiv.org/abs/2404.16510) | | 3D |
@@ -419,6 +426,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [PAniC-3D](https://github.com/shuhongchen/panic3d-anime-reconstruction) | Stylized Single-view 3D Reconstruction from Portraits of Anime Characters. |[arXiv](https://arxiv.org/abs/2303.14587) | | Model |
 | [Point·E](https://github.com/openai/point-e) | Point cloud diffusion for 3D model synthesis. | | | Model |
 | [ProlificDreamer](https://ml.cs.tsinghua.edu.cn/prolificdreamer/) | High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation. |[arXiv](https://arxiv.org/abs/2305.16213) | | Model |
+| [SF3D](https://github.com/Stability-AI/stable-fast-3d) | SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement. |[arXiv](https://arxiv.org/abs/2408.00653) | | 3D |
 | [Shap-E](https://github.com/openai/shap-e) | Generate 3D objects conditioned on text or images. |[arXiv](https://arxiv.org/abs/2305.02463) | | Model |
 | [Sloyd](https://www.sloyd.ai/) | 3D modelling has never been easier. | | | Model |
 | [Spline AI](https://spline.design/ai) | The power of AI is coming to the 3rd dimension. Generate objects, animations, and textures using prompts. | | | Model |
@@ -519,6 +527,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [Cambrian-1](https://github.com/cambrian-mllm/cambrian) | Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs. |[arXiv](https://arxiv.org/abs/2406.16860) | | Multimodal LLMs |
 | [CogVLM2](https://github.com/THUDM/CogVLM2) | GPT4V-level open-source multi-modal model based on Llama3-8B. | | | Visual |
 | [CoTracker](https://co-tracker.github.io/) | It is Better to Track Together. |[arXiv](https://arxiv.org/abs/2307.07635) | | Visual |
+| [EVF-SAM](https://github.com/hustvl/EVF-SAM) | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model. |[arXiv](https://arxiv.org/abs/2406.20076) | | Visual |
 | [FaceHi](https://m.facehi.ai/) | It is Better to Track Together. | | | Visual |
 | [InternLM-XComposer2](https://github.com/InternLM/InternLM-XComposer) | InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension. |[arXiv](https://arxiv.org/abs/2404.06512) | | Visual |
 | [Kangaroo](https://github.com/KangarooGroup/Kangaroo) | Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input. | | | Visual |
@@ -557,6 +566,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [Boximator](https://boximator.github.io/) | Generating Rich and Controllable Motions for Video Synthesis. |[arXiv](https://arxiv.org/abs/2402.01566) | | Video |
 | [CoDeF](https://github.com/qiuyu96/codef) | Content Deformation Fields for Temporally Consistent Video Processing. |[arXiv](https://arxiv.org/abs/2308.07926) | | Video |
 | [CogVideo](https://models.aminer.cn/cogvideo/) | Generate Videos from Text Descriptions. | | | Video |
+| [CogVideoX](https://github.com/THUDM/CogVideo) | CogVideoX is an open-source version of the video generation model, which is homologous to 清影 (Qingying). | | | Video |
 | [CogVLM](https://github.com/THUDM/CogVLM) | CogVLM is a powerful open-source visual language model (VLM). | | | Visual |
 | [CoNR](https://github.com/megvii-research/CoNR) | Generate vivid dancing videos from hand-drawn anime character sheets (ACS). |[arXiv](https://arxiv.org/abs/2207.05378) | | Video |
 | [Decohere](https://www.decohere.ai/) | Create what can't be filmed. | | | Video |
@@ -630,6 +640,7 @@ Here we will keep track of the latest AI Game Development Tools, including LLM,
 | [TATS](https://songweige.github.io/projects/tats/index.html) | Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer. | | | Video |
 | [Text2Video-Zero](https://github.com/Picsart-AI-Research/Text2Video-Zero) | Text-to-Image Diffusion Models are Zero-Shot Video Generators. |[arXiv](https://arxiv.org/abs/2303.13439) | | Video |
 | [TF-T2V](https://tf-t2v.github.io/) | A Recipe for Scaling up Text-to-Video Generation with Text-free Videos. |[arXiv](https://arxiv.org/abs/2312.15770) | | Video |
+| [Tora](https://github.com/ali-videoai/Tora) | Tora: Trajectory-oriented Diffusion Transformer for Video Generation. |[arXiv](https://arxiv.org/abs/2407.21705) | | Video |
 | [Track-Anything](https://github.com/gaomingqi/Track-Anything) | Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything and XMem. |[arXiv](https://arxiv.org/abs/2304.11968) | | Video |
 | [Tune-A-Video](https://github.com/showlab/Tune-A-Video) | One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. |[arXiv](https://arxiv.org/abs/2212.11565) | | Video |
 | [TwelveLabs](https://www.twelvelabs.io/) | Multimodal AI that understands videos like humans. | | | Video |