AI Design Tools Introduction
Please strictly adhere to laws and regulations. Do not generate content that could harm individuals or organizations. These projects are for scientific research, learning, and personal entertainment only.
The following projects use two kinds of addresses: huggingface.co and 172.16.167.44.
If huggingface.co does not open, connect to WiFi 415 and try again.
172.16.167.44 can be accessed after connecting to any local WiFi.
We have prepared four ChatGPT 4.0 accounts. If you need one, please use the desktop computers in the Design Zone (Room 415). To keep response speed up, please do not share the link outside.
Introduction: A singing-voice conversion model. Provide source audio in one of three ways: paste a Bilibili song video link, upload a file, or record from the microphone. Then select an AI singer to generate a cloned voice, song, and so on. Of course, it is not limited to converting songs.
Choose the song video you want on Bilibili and paste its URL (or BV number) into the first field, or upload your own audio from the microphone or a file.
Fill in the start and stop times (in seconds) of the section of the original audio to clone.
Select the AI singer whose voice you want to clone.
Click Submit to generate the cloned song.
The voice models used by this project can be obtained by training in Project 8.
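The start/stop fields above simply select a slice of the source audio. As an illustration of what that step does, here is a minimal sketch (standard library only, not code from the project itself) that cuts a PCM WAV file down to a [start, stop] window given in seconds; the file names are placeholders.

```python
import wave

def cut_wav_segment(src_path, dst_path, start_s, stop_s):
    """Copy the [start_s, stop_s] slice (in seconds) of a PCM WAV file to dst_path."""
    with wave.open(src_path, "rb") as src:
        framerate = src.getframerate()
        start_frame = int(start_s * framerate)
        n_frames = int((stop_s - start_s) * framerate)
        src.setpos(start_frame)          # seek to the start of the window
        frames = src.readframes(n_frames)
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)            # header frame count is patched on close
        dst.writeframes(frames)

# Example (placeholder paths): keep seconds 30-60 of a recording.
# cut_wav_segment("song.wav", "clip.wav", 30.0, 60.0)
```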
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili02
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili03
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili04
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili05
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili06
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili07
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili08
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili09
https://huggingface.co/spaces/nijisakai/Voice-Cloning-for-Bilibili10
Introduction: Upload an audio clip of at least 10 seconds to obtain a voice model, then use the model for text-to-speech. You can also add elements such as laughter or coughing (English is supported best).
Upload the audio, then click Create Voice to start training.
Select the TTS tab, enter text in the Input Text area, and click Generate.
INPUT TEXT: Hello, my name is [clears throat] Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe.
OUTPUT AUDIO:
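Bark generates only a short stretch of audio per request, so long scripts are usually split into sentence-sized chunks and synthesized one at a time. A hypothetical helper (not part of the Space itself) that splits text at sentence boundaries while keeping bracketed cues like [laughs] attached to their sentence:

```python
import re

def split_for_tts(text, max_chars=200):
    """Split text into chunks of at most max_chars, breaking at sentence ends.

    Illustrative helper, not part of the Bark Space. Bracketed cues such as
    [laughs] stay attached to the sentence they belong to.
    """
    # Split after ., !, or ? followed by whitespace; the punctuation is kept.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sent in sentences:
        if current and len(current) + 1 + len(sent) > max_chars:
            chunks.append(current)
            current = sent
        else:
            current = f"{current} {sent}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to the TTS tab (or to Bark directly) and the resulting clips concatenated.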
More details: https://huggingface.co/spaces/suno/bark
https://huggingface.co/spaces/nijisakai/clone-bark-cn
https://huggingface.co/spaces/nijisakai/clone-bark-cn02
https://huggingface.co/spaces/nijisakai/clone-bark-cn03
https://huggingface.co/spaces/nijisakai/clone-bark-cn04
https://huggingface.co/spaces/nijisakai/clone-bark-cn05
https://huggingface.co/spaces/nijisakai/clone-bark-cn06
https://huggingface.co/spaces/nijisakai/clone-bark-cn07
https://huggingface.co/spaces/nijisakai/clone-bark-cn08
https://huggingface.co/spaces/nijisakai/clone-bark-cn09
https://huggingface.co/spaces/nijisakai/clone-bark-cn10
Introduction: Generate images automatically from a text description or an uploaded image, with a choice of models.
Fill in the prompt (describing the image you want to generate) and, optionally, the negative prompt (describing what should not appear in the image).
Set the model, the image parameters, and the number of batches and images per batch (the default parameters can be used).
Click Generate to generate the image.
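If the local servers at 172.16.167.44:9001–9008 run the Stable Diffusion web UI with the AUTOMATIC1111 JSON API enabled (an assumption, not confirmed by this handout), generation can also be scripted. A sketch that builds the txt2img request body; the server URL and default parameters are placeholders:

```python
import json
import urllib.request

def build_txt2img_payload(prompt, negative_prompt="", steps=20, batch_size=1):
    """Build the JSON body for a txt2img request.

    Field names follow the AUTOMATIC1111 /sdapi/v1/txt2img API; whether the
    classroom servers expose that API is an assumption.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "batch_size": batch_size,
        "width": 512,
        "height": 512,
    }

# Example request (only works on-site, and only if the API is enabled):
# req = urllib.request.Request(
#     "http://172.16.167.44:9001/sdapi/v1/txt2img",
#     data=json.dumps(build_txt2img_payload("a watercolor fox")).encode(),
#     headers={"Content-Type": "application/json"},
# )
# images = json.loads(urllib.request.urlopen(req).read())["images"]  # base64 PNGs
```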
http://172.16.167.44:9001
http://172.16.167.44:9002
http://172.16.167.44:9003
http://172.16.167.44:9004
http://172.16.167.44:9005
http://172.16.167.44:9006
http://172.16.167.44:9007
http://172.16.167.44:9008
For users who are not on-site:
https://huggingface.co/spaces?search=stable diffusion
Introduction: Upload an image and enter a text prompt, and the model automatically generates a text response.
Upload an image and enter a prompt.
Click Generate to produce the text output.
Use ChatGLM2-6B:
http://172.16.167.44:12021
https://chatglm.cn
Introduction: Generate a digital human whose face, lips, and eyes move by uploading a photo of a person and an audio file (optionally, also a reference video to imitate).
Upload an image.
Upload an audio file or a reference video.
Click Generate to generate the video file.
http://172.16.167.44:9001
http://172.16.167.44:9002
http://172.16.167.44:9003
http://172.16.167.44:9004
http://172.16.167.44:9005
http://172.16.167.44:9006
http://172.16.167.44:9007
http://172.16.167.44:9008
https://huggingface.co/spaces/nijisakai/SadTalkerGPU
https://huggingface.co/spaces/vinthony/SadTalker
Introduction: Upload an image to automatically generate a 3D model, then send it to the 3D printer for output.
Upload an image.
Adjust the recognition range around the object's subject.
Click Run Generation to generate the object's 3D model file and multi-view images.
https://huggingface.co/spaces/One-2-3-45/One-2-3-45
Site: Room 415, the computer next to the 3D printer.
Software: Creality Print, on the computer desktop.
Click File in the top left corner of the software, then Open File, and open an .obj file. Make sure the Printer shown on the right matches the model in the picture. Use the function area on the left to adjust the model's size and other settings, then click the Slice button in the bottom right corner to proceed to the next step.
Click Export to Local in the bottom right corner and export the file in the .gcode format.
On the printer, select Print: press the knob to enter the file-selection screen, rotate the knob to select the file you want to print, then press the knob again to start printing.
Introduction: Train your own voice model in a Jupyter Notebook. This model has good support for Chinese speech.
Jupyter Notebook Usage: Ask ChatGPT
Online Recorder
Recorder for Windows
Audio format conversion for Windows
Prepare your own audio file in WAV format (at least one minute long).
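A quick way to confirm a recording is long enough before uploading, as a sketch using only the standard library (the file name is a placeholder):

```python
import wave

def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

# The notebook expects at least one minute of audio (placeholder file name):
# assert wav_duration_seconds("my_voice.wav") >= 60, "recording is too short"
```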
Enter the project1 directory and make a copy of the ipynb file. (To prevent accidental deletion by other users, the project files are read-only.)
Upload the audio file you prepared.
Open the ipynb file, follow the notebook's prompts to train step by step, and obtain the speech file generated by the model.
Before starting a new training run, delete all generated files and folders.
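The cleanup step above can be scripted. A sketch of a helper that removes the notebook's outputs; the folder names in the example are placeholders, since the exact names depend on your run — check what the notebook actually created before deleting anything:

```python
import shutil
from pathlib import Path

def clean_generated(base_dir, names):
    """Delete the listed generated files/folders under base_dir, if present."""
    for name in names:
        target = Path(base_dir) / name
        if target.is_dir():
            shutil.rmtree(target)
        elif target.exists():
            target.unlink()

# Example (placeholder names; use whatever your training run actually produced):
# clean_generated(".", ["dataset_raw", "logs", "output.wav"])
```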
If you have questions about the code, why not send it to ChatGPT for an explanation?
#############
# Ask GPT:
#############

import sox
import subprocess
from pathlib import Path
import librosa
from scipy.io import wavfile
import numpy as np
import torch
import csv
import whisper

def split_long_audio(model, filepaths, character_name, save_dir="data_dir", out_sr=44100):
    if isinstance(filepaths, str):
        filepaths = [filepaths]
    for file_idx, filepath in enumerate(filepaths):
        save_path = Path(save_dir) / character_name
        save_path.mkdir(exist_ok=True, parents=True)
        print(f"Transcribing file {file_idx}: '{filepath}' to segments...")
        result = model.transcribe(filepath, word_timestamps=True, task="transcribe",
                                  beam_size=5, best_of=5)
        segments = result['segments']
        wav, sr = librosa.load(filepath, sr=None, offset=0, duration=None, mono=True)
        wav, _ = librosa.effects.trim(wav, top_db=20)
        peak = np.abs(wav).max()
        if peak > 1.0:
            wav = 0.98 * wav / peak
        wav2 = librosa.resample(wav, orig_sr=sr, target_sr=out_sr)
        wav2 /= max(wav2.max(), -wav2.min())
        for i, seg in enumerate(segments):
            start_time = seg['start']
            end_time = seg['end']
            wav_seg = wav2[int(start_time * out_sr):int(end_time * out_sr)]
            wav_seg_name = f"{character_name}_{file_idx}_{i}.wav"
            out_fpath = save_path / wav_seg_name
            wavfile.write(out_fpath, rate=out_sr,
                          data=(wav_seg * np.iinfo(np.int16).max).astype(np.int16))

#############
# GPT ANSWER:
#############

This code is a Python function called `split_long_audio`. Here's a breakdown of its functionality and the libraries being imported:

1. **Imported Libraries**:
   - `sox`: A library to work with the SoX sound processing program.
   - `subprocess`: A standard library module to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.
   - `pathlib`: Offers a set of classes to handle filesystem paths.
   - `librosa`: A library for analyzing and extracting features from audio.
   - `scipy.io.wavfile`: Provides functions for reading and writing WAV files.
   - `numpy`: A library for numerical operations in Python.
   - `torch`: Represents PyTorch, a popular deep learning library.
   - `csv`: A module to read from and write to CSV files.
   - `whisper`: This isn't a standard library in Python's ecosystem as of my last update in 2021. It might be a custom library or a newer one that's come into use after 2021.

2. **Functionality of `split_long_audio`**:
   - The function takes in a model (likely a transcription model), a list of audio file paths or a single path, a character name (probably for naming the split segments), an optional directory to save the data (`save_dir`), and an optional sampling rate (`out_sr`).
   - For each audio file:
     1. The function creates a directory based on the `character_name`.
     2. It uses the model to transcribe the audio file and get timestamps for each word or segment.
     3. It loads the audio file using `librosa`, trims silent parts, and resamples it to the desired sampling rate (`out_sr`).
     4. For each identified segment in the audio:
        - It extracts the segment.
        - Names it based on the `character_name`, file index, and segment index.
        - Writes the segment as a WAV file to the previously created directory.

3. **Purpose**:
   - The function's primary purpose seems to be processing long audio files, segmenting them based on transcriptions, and saving each segment as a separate audio file.

If you have specific questions about any part of the code or its functionality, please let me know!
password: sli2023
http://172.16.167.44:8881/tree/project1
http://172.16.167.44:8882/tree/project1
http://172.16.167.44:8883/tree/project1
http://172.16.167.44:8884/tree/project1
http://172.16.167.44:8885/tree/project1
http://172.16.167.44:8886/tree/project1
http://172.16.167.44:8887/tree/project1
http://172.16.167.44:8888/tree/project1
For users who are not on-site:
Download the ipynb file and upload it to an online Jupyter Notebook server, such as
Google Colab
Alibaba ModelScope (阿里魔搭社区)
Yuanzhuo Online Programming Platform (元卓在线编程平台)
Introduction: Train a model with the so-vits-svc method, upload it to Hugging Face, and load and run inference with it in an online project.
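The upload step can be scripted with the `huggingface_hub` library. A sketch that first checks the trained model folder looks complete before uploading; the expected file names (`config.json` plus a `G_*.pth` generator checkpoint) are typical so-vits-svc output, but verify them against your own training run:

```python
from pathlib import Path

def check_sovits_model_dir(model_dir):
    """Return True if the folder has the files a so-vits-svc model usually needs.

    The expected names (config.json and a G_*.pth checkpoint) are an assumption
    based on typical so-vits-svc output; check them against your own run.
    """
    model_dir = Path(model_dir)
    has_config = (model_dir / "config.json").is_file()
    has_generator = any(model_dir.glob("G_*.pth"))
    return has_config and has_generator

# Upload with huggingface_hub (requires `pip install huggingface_hub` and a
# Hugging Face access token; repo_id and paths below are placeholders):
# from huggingface_hub import HfApi
# api = HfApi(token="hf_...")  # placeholder token
# api.upload_folder(folder_path="my_model", repo_id="your-name/your-model")
```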
Jupyter Notebook Usage: Ask ChatGPT
password: sli2023
http://172.16.167.44:8881/tree/project2
http://172.16.167.44:8882/tree/project2
http://172.16.167.44:8883/tree/project2
http://172.16.167.44:8884/tree/project2
http://172.16.167.44:8885/tree/project2
http://172.16.167.44:8886/tree/project2
http://172.16.167.44:8887/tree/project2
http://172.16.167.44:8888/tree/project2
For users who are not on-site:
Download the ipynb file and upload it to an online Jupyter Notebook server, such as
Google Colab
Alibaba ModelScope (阿里魔搭社区)
Yuanzhuo Online Programming Platform (元卓在线编程平台)
https://docs.qq.com/sheet/DUm10WnpnUVVEQU1y
陈虹宇 Hongyu Chen
13161092527
Room 415