🚀AI作品设计工具介绍

AI Design Tools Introduction

特别注意 Special Note😮

请严格遵守法律法规。请不要生成会对个人以及组织造成侵害的内容,此程序仅供科研、学习及个人娱乐使用。
Please strictly adhere to laws and regulations. Do not generate content that may harm individuals or organizations. These projects are for scientific research, learning, and personal entertainment purposes only.

如何访问 How to Access🖥️


CONTENT

1. AI歌曲 Singing Voice Conversion🎤
2. 声音克隆 Voice Clone🐶
3. 图像生成 Stable Diffusion🎨
4. 图像描述 Image Captioning🖼️
5. AI数字人 AI Digital Human👻
6. 3D 打印 3D Mesh📦
7. 音频模型训练 ModelScope Model Training🪄
8. 音频模型训练 so-vits Model Training🪄
其他项目 Other Projects
最后:众包整理AI项目合集,欢迎贡献 Crowdsourcing a compilation of AI projects, contributions are welcome🥰

EASY🤗


1. AI歌曲 Singing Voice Conversion🎤

简介:歌声音色转换模型,支持三种上传方式:粘贴B站歌曲视频链接、上传文件或录音;然后选择AI歌手,生成克隆声音、歌曲等。当然可以不局限于转换歌曲。
Introduction: A singing voice conversion model. Provide source audio in one of three ways: paste a Bilibili music video link, upload a file, or record directly; then select an AI singer to generate cloned voices, songs, and more. It is not limited to converting songs.

操作流程 Procedure

在Bilibili选择需要的歌曲视频,粘贴网址到第一栏;或选择从麦克风/文件上传自己的声音文件。
Choose the music video you want on Bilibili and paste its URL or BV number into the first field, or upload your own audio from the microphone or a file.

填写原音频克隆部分的起止时间(秒)。
Fill in the start and end times (in seconds) of the section of the original audio to be cloned.

选择你想克隆音色的AI歌手。
Select the AI singer whose voice you want to clone.

点击Submit生成克隆歌曲。
Click Submit to generate the cloned song.

project1

示例 Sample

提示 HINT

本项目的声音模型,可通过项目8进行训练获取。
The voice model of this project can be obtained by training through Project 8.

项目来源 Reference

so-vits-svc

使用地址 Address

https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili02
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili03
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili04
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili05
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili06
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili07
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili08
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili09
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili10
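If you would rather drive one of these Spaces from a script than from the browser, the gradio_client library can call any public Gradio Space. This is only a sketch: every Space exposes different endpoints and parameters, so inspect them with view_api() before calling predict().

# A minimal sketch, assuming the Space exposes a public Gradio API.
# Endpoint names and argument order vary per Space, so check them first.
from gradio_client import Client

client = Client("nijisakai/Voice-Cloning-for-Bilibili")
client.view_api()  # prints the Space's actual endpoints and parameters

# Hypothetical call shape (argument names and order are NOT verified):
# result = client.predict("BV-number-or-URL", 0, 30, "singer-name", api_name="/predict")
# print(result)  # typically a path to the generated audio file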


2. 声音克隆 Voice Clone🐶

简介:上传10秒以上音频,得到模型,使用模型进行文字转语音。还可在其中加入笑声、咳嗽等元素(对英文支持较好)
Introduction: Upload an audio clip of more than 10 seconds to obtain the model. Use the model for text-to-speech conversion. You can also add elements like laughter, coughing, etc. (better support for English).

操作流程 Procedure

上传音频,然后点击Create Voice进行训练
Upload the audio, and then click Create Voice to start training.

project02

选择TTS标签页,在Input Text区域输入文字,然后点击Generate
Select the TTS tab, enter text in the Input Text area, and then click Generate.

示例 Sample

INPUT TEXT:
Hello, my name is [clears throat] Suno. And, uh — and I like pizza. [laughs]
But I also have other interests such as playing tic tac toe.
OUTPUT AUDIO:

更多用法 More details: https://huggingface.co/spaces/suno/bark
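For reference, generation with the base Bark model can also be scripted locally. This is a minimal sketch using the upstream suno-ai/bark package (pip install git+https://github.com/suno-ai/bark.git); it uses Bark's stock voices, not a voice cloned through this project's web UI.

# Text-to-speech with the base Bark model; output is a 24 kHz WAV file.
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()  # downloads and loads the Bark checkpoints on first run

text_prompt = "Hello, my name is [clears throat] Suno. And, uh, I like pizza. [laughs]"
audio_array = generate_audio(text_prompt)

write_wav("bark_generation.wav", SAMPLE_RATE, audio_array)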

项目来源 Reference

Bark🐶

使用地址 Address

https://huggingface.co/spaces/nijisakai/clone-bark-cn
https://huggingface.co/spaces/nijisakai/clone-bark-cn02
https://huggingface.co/spaces/nijisakai/clone-bark-cn03
https://huggingface.co/spaces/nijisakai/clone-bark-cn04
https://huggingface.co/spaces/nijisakai/clone-bark-cn05
https://huggingface.co/spaces/nijisakai/clone-bark-cn06
https://huggingface.co/spaces/nijisakai/clone-bark-cn07
https://huggingface.co/spaces/nijisakai/clone-bark-cn08
https://huggingface.co/spaces/nijisakai/clone-bark-cn09
https://huggingface.co/spaces/nijisakai/clone-bark-cn10


3. 图像生成 Stable Diffusion🎨

简介:可通过文字描述、上传图片,并选择各种模型来自动生成图片。
Introduction: Generate images automatically from a text description or an uploaded image, with a choice of models.

操作流程 Procedure

填写提示词(描述想生成的图像)与反向提示词(非必填,描述图片中不想出现的内容)。
Fill in the prompt (describing the image you want to generate) and the negative prompt (optional, describing what you do not want to appear in the image).

设置模型,图片参数与生成图片批次、数量等。
Set the model, the image parameters, and the batch count and number of images to generate (the default parameters work fine).

点击「生成」生成图像。
Click Generate to generate the image.

project3

示例 Sample

项目来源 Reference

Stable Diffusion web UI

使用地址 Address

http://172.16.167.44:9001
http://172.16.167.44:9002
http://172.16.167.44:9003
http://172.16.167.44:9004
http://172.16.167.44:9005
http://172.16.167.44:9006
http://172.16.167.44:9007
http://172.16.167.44:9008

For users who are not on-site:
https://huggingface.co/spaces?search=stable+diffusion
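If you want to script generation instead of using the web UI, the diffusers library does roughly what the txt2img tab does. A minimal sketch; the model ID, prompts, and parameters here are examples only, and a CUDA GPU is assumed.

# Text-to-image with diffusers; roughly equivalent to the web UI's txt2img tab.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a watercolor painting of a lighthouse at dawn",
    negative_prompt="lowres, blurry, watermark",  # what should NOT appear
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("output.png")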


4. 图像描述 Image Captioning🖼️

简介:上传图片并输入提示词,自动生成对图片内容的文字描述。
Introduction: Upload an image and enter a prompt to automatically generate a text description of the image.

操作流程 Procedure

上传图像和提示词。
Upload image and prompt.

点击Generate执行。
Click Generate to produce the description.

project04

示例 Sample

项目来源 Reference

VisualGLM-6B

使用地址 Address

http://172.16.167.44:12011

补充 Addition

You can also use ChatGLM2-6B:
http://172.16.167.44:12021
https://chatglm.cn
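For scripted captioning, the upstream VisualGLM-6B README shows a transformers-based chat interface. A sketch under those assumptions (a CUDA GPU with enough memory, and a local example.jpg):

# Image captioning / visual Q&A with VisualGLM-6B, per the upstream README.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# model.chat takes (tokenizer, image_path, prompt, history)
response, history = model.chat(tokenizer, "example.jpg", "描述这张图片。", history=[])
print(response)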


5. AI数字人 AI Digital Human👻

简介:通过上传人物照片,并上传声音文件(可选:上传供模仿的参考视频),可生成脸部、嘴唇、眼睛会动的数字人。
Introduction: You can generate digital human videos with moving facial features, lips, and eyes by uploading a photo and an audio file (optionally, a reference video for imitation).

操作流程 Procedure

上传图片。
Upload image.

上传音频文件或参考视频。
Upload an audio file or a reference video.

点击「生成」生成视频文件。
Click “Generate” to generate video.

project2

示例 Sample

项目来源 Reference

SadTalker

使用地址 Address

http://172.16.167.44:9001
http://172.16.167.44:9002
http://172.16.167.44:9003
http://172.16.167.44:9004
http://172.16.167.44:9005
http://172.16.167.44:9006
http://172.16.167.44:9007
http://172.16.167.44:9008

https://huggingface.co/spaces/nijisakai/SadTalkerGPU
https://huggingface.co/spaces/vinthony/SadTalker
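For local, scripted use, the upstream SadTalker repo is driven from the command line. A sketch only: the flag names below follow the upstream README as of this writing and should be verified with python inference.py --help; the file paths are placeholders.

# Run SadTalker's inference script from Python; assumes the repo is cloned
# and its dependencies are installed. Verify flag names against the repo.
import subprocess

subprocess.run(
    [
        "python", "inference.py",
        "--source_image", "portrait.png",  # the face photo
        "--driven_audio", "speech.wav",    # the audio that drives the lips
        "--result_dir", "./results",       # where the output video is written
    ],
    check=True,
)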


6. 3D 打印 3D Mesh📦

简介:上传图片,自动生成3D模型,连接3D打印机输出。
Introduction: Upload an image to automatically generate a 3D model, which can then be sent to a 3D printer.

操作流程 Procedure

上传图片。
Upload image.

调整物品主体识别范围。
Adjust the recognition region for the main subject of the image.

点击Run Generation,生成物品3D模型文件及多角度图像。
Click Run Generation to generate the subject's 3D model file and multi-view images.

project4

示例 Sample

使用地址 Address

https://huggingface.co/spaces/One-2-3-45/One-2-3-45

使用3D打印机 To Use 3D Printer

Site: Room 415, the computer next to the 3D printer.
Software: Creality Print, on the computer desktop.

project06

project06

project06
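Before slicing the exported model in Creality Print, it can help to sanity-check the mesh in Python first. A sketch using the trimesh library; the filename is a placeholder.

# Quick mesh check before 3D printing; requires `pip install trimesh`.
import trimesh

mesh = trimesh.load("one2345_output.obj", force="mesh")
print("watertight:", mesh.is_watertight)  # watertight meshes print more reliably
print("extents:", mesh.extents)           # bounding-box size in model units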


NOT EASY😥


7. 音频模型训练 ModelScope Model Training🪄

简介:在Jupyter Notebook中训练自己的声音模型。此模型对中文语音支持较好。
Introduction: Train a model of your own voice in a Jupyter Notebook. This model has good support for Chinese speech.
Jupyter Notebook usage: ask ChatGPT.

可能涉及的工具 Tools that might be involved

Online Recorder
Recorder for Windows
Audio format conversion for Windows
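If you prefer to convert a recording to WAV yourself, one option is the pydub library (which wraps ffmpeg). A sketch; the filenames and the 44.1 kHz mono target are assumptions, so match whatever the notebook asks for.

# Convert any recording to WAV; requires `pip install pydub` and ffmpeg on PATH.
from pydub import AudioSegment

audio = AudioSegment.from_file("recording.m4a")
audio = audio.set_frame_rate(44100).set_channels(1)  # 44.1 kHz, mono
audio.export("recording.wav", format="wav")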

操作流程 Procedure

准备自己的音频文件,WAV格式(至少一分钟)。
Prepare your own audio file in WAV format (at least one minute long).

进入project1目录,把ipynb文件复制一份。(为防止多用户误删,项目文件为只读权限)
Enter the project1 directory and make a copy of the ipynb file. (To prevent accidental deletion by multiple users, the project file is read-only).

上传你准备的Audio文件。
Upload the Audio file you have prepared.

打开ipynb文件,根据Jupyter Notebook的提示进行逐步训练,并得到由模型生成的语音文件。
Open the ipynb file and follow the prompts of the Jupyter Notebook to train step by step, and obtain the voice file generated by the model.

project5

开始新训练前请删除所有生成的文件和文件夹。
Before starting a new training, please delete all generated files and folders.

示例 Sample

详细说明 Details

可参考这里
You can refer to this

提示 Hint

如果对代码有疑问,何不发给ChatGPT帮忙解读呢?
If you have questions about the code, why not send it to ChatGPT for help interpreting it?

#############
# Ask GPT:
#############
import sox
import subprocess
from pathlib import Path
import librosa
from scipy.io import wavfile
import numpy as np
import torch
import csv
import whisper

def split_long_audio(model, filepaths, character_name, save_dir="data_dir", out_sr=44100):
    if isinstance(filepaths, str):
        filepaths = [filepaths]

    for file_idx, filepath in enumerate(filepaths):

        save_path = Path(save_dir) / character_name
        save_path.mkdir(exist_ok=True, parents=True)

        print(f"Transcribing file {file_idx}: '{filepath}' to segments...")
        result = model.transcribe(filepath, word_timestamps=True, task="transcribe", beam_size=5, best_of=5)
        segments = result['segments']

        wav, sr = librosa.load(filepath, sr=None, offset=0, duration=None, mono=True)
        wav, _ = librosa.effects.trim(wav, top_db=20)
        peak = np.abs(wav).max()
        if peak > 1.0:
            wav = 0.98 * wav / peak
        wav2 = librosa.resample(wav, orig_sr=sr, target_sr=out_sr)
        wav2 /= max(wav2.max(), -wav2.min())

        for i, seg in enumerate(segments):
            start_time = seg['start']
            end_time = seg['end']
            wav_seg = wav2[int(start_time * out_sr):int(end_time * out_sr)]
            wav_seg_name = f"{character_name}_{file_idx}_{i}.wav"
            out_fpath = save_path / wav_seg_name
            wavfile.write(out_fpath, rate=out_sr, data=(wav_seg * np.iinfo(np.int16).max).astype(np.int16))
#############
GPT ANSWER:
#############
这段代码是一个名为`split_long_audio`的Python函数。以下是其功能及所导入库的简介:

1. **导入的库**:
    - `sox`: 一个用于处理SoX声音处理程序的库。
    - `subprocess`: 一个标准库模块,用于产生新的进程,连接它们的输入/输出/错误管道,并获取它们的返回码。
    - `pathlib`: 提供一套处理文件系统路径的类。
    - `librosa`: 用于分析和提取音频特征的库。
    - `scipy.io.wavfile`: 提供读取和写入WAV文件的函数。
    - `numpy`: 一个用于Python中的数值操作的库。
    - `torch`: 代表PyTorch,一个流行的深度学习库。
    - `csv`: 一个读取和写入CSV文件的模块。
    - `whisper`: 在2021年之前的Python生态系统中,这不是一个标准的库。这可能是一个定制的库或是在2021年之后开始使用的新库。

2. **`split_long_audio`的功能**:
    - 该函数接收一个模型(可能是一个转录模型)、一个音频文件路径列表或单个路径、一个角色名(可能用于命名分割段落)、一个可选的数据保存目录(`save_dir`)和一个可选的采样率(`out_sr`)。
    - 对于每个音频文件:
        1. 函数基于`character_name`创建一个目录。
        2. 它使用模型转录音频文件并获取每个单词或段落的时间戳。
        3. 使用`librosa`加载音频文件,修剪静音部分,并将其重新采样到所需的采样率(`out_sr`)。
        4. 对于音频中的每个识别段落:
            - 它提取该段。
            - 根据`character_name`、文件索引和段落索引为其命名。
            - 将段落写入之前创建的目录的一个WAV文件中。

3. **目的**:
    - 函数的主要目的似乎是处理长音频文件,根据转录将其分段,并将每个段落保存为一个单独的音频文件。

如果您对代码的任何部分或其功能有特定的问题,请告诉我!

#############
GPT ANSWER IN ENGLISH:
#############
This code is a Python function called `split_long_audio`. Here's a breakdown of its functionality and the libraries being imported:

1. **Imported Libraries**:
    - `sox`: A library to work with the SoX sound processing program.
    - `subprocess`: A standard library module to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
    - `pathlib`: Offers a set of classes to handle filesystem paths.
    - `librosa`: A library for analyzing and extracting features from audio.
    - `scipy.io.wavfile`: Provides functions for reading and writing WAV files.
    - `numpy`: A library for numerical operations in Python.
    - `torch`: Represents PyTorch, a popular deep learning library.
    - `csv`: A module to read from and write to CSV files.
    - `whisper`: This isn't a standard library in Python's ecosystem as of my last update in 2021. It might be a custom library or a newer one that's come into use after 2021.

2. **Functionality of `split_long_audio`**:
    - The function takes in a model (likely a transcription model), a list of audio file paths or a single path, a character name (probably for naming the split segments), an optional directory to save the data (`save_dir`), and an optional sampling rate (`out_sr`).
    - For each audio file:
        1. The function creates a directory based on the `character_name`.
        2. It uses the model to transcribe the audio file and get timestamps for each word or segment.
        3. It loads the audio file using `librosa`, trims silent parts, and resamples it to the desired sampling rate (`out_sr`).
        4. For each identified segment in the audio:
            - It extracts the segment.
            - Names it based on the `character_name`, file index, and segment index.
            - Writes the segment as a WAV file to the previously created directory.

3. **Purpose**:
    - The function's primary purpose seems to be processing long audio files, segmenting them based on transcriptions, and saving each segment as a separate audio file.

If you have specific questions about any part of the code or its functionality, please let me know!
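As a usage sketch for the function above (the Whisper model size and the file and speaker names here are assumptions):

# Drive split_long_audio with an openai-whisper model; names are examples.
whisper_model = whisper.load_model("medium")  # larger models are slower but more accurate
split_long_audio(whisper_model, "my_voice.wav", "my_speaker", save_dir="dataset_raw")
# Segments land in dataset_raw/my_speaker/my_speaker_0_*.wav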

使用地址 Address

password: sli2023

http://172.16.167.44:8881/tree/project1
http://172.16.167.44:8882/tree/project1
http://172.16.167.44:8883/tree/project1
http://172.16.167.44:8884/tree/project1
http://172.16.167.44:8885/tree/project1
http://172.16.167.44:8886/tree/project1
http://172.16.167.44:8887/tree/project1
http://172.16.167.44:8888/tree/project1

For users who are not on-site:
Download the ipynb file and upload it to an online Jupyter Notebook server, such as:
Google Colab
阿里魔搭社区
元卓在线编程平台


8. 音频模型训练 so-vits Model Training🪄

简介:使用so-vits-svc方式训练模型,上传到Hugging Face,并使用在线项目加载和推理模型。
Introduction: Train a model with the so-vits-svc method, upload it to Hugging Face, and then load it for inference in an online project (see Project 1).
Jupyter Notebook usage: ask ChatGPT.
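The upload step can also be done outside the notebook with the huggingface_hub library. A minimal sketch; the repo id, token, and checkpoint path are placeholders (so-vits-svc typically writes checkpoints under logs/44k/).

# Upload a trained checkpoint to the Hugging Face Hub.
from huggingface_hub import HfApi

api = HfApi(token="hf_xxx")  # a write-scoped access token
api.create_repo(repo_id="your-name/so-vits-model", exist_ok=True)
api.upload_file(
    path_or_fileobj="logs/44k/G_10000.pth",  # trained generator checkpoint (placeholder)
    path_in_repo="G_10000.pth",
    repo_id="your-name/so-vits-model",
)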

操作流程 Procedure

示例 Sample

使用地址 Address

password: sli2023

http://172.16.167.44:8881/tree/project2
http://172.16.167.44:8882/tree/project2
http://172.16.167.44:8883/tree/project2
http://172.16.167.44:8884/tree/project2
http://172.16.167.44:8885/tree/project2
http://172.16.167.44:8886/tree/project2
http://172.16.167.44:8887/tree/project2
http://172.16.167.44:8888/tree/project2

For users who are not on-site:
Download the ipynb file and upload it to an online Jupyter Notebook server, such as:
Google Colab
阿里魔搭社区
元卓在线编程平台


其他项目 Other Projects


9. 在这里发现更多有趣的AI项目,并可在线运行 Discover more interesting AI projects here and run them online🤗

https://huggingface.co

hf

10. AI助力选择困难症 AI helps with choice paralysis🤨

https://eitherchoice.com

ec1 ec2

最后:众包整理AI项目合集,欢迎贡献 Crowdsourcing a compilation of AI projects, contributions are welcome🥰

https://docs.qq.com/sheet/DUm10WnpnUVVEQU1y


CONTACT

陈虹宇 Hongyu Chen
13161092527
Room 415
