Multimedia program development is the technical field that integrates text, images, audio, video, and animation, implementing interactive functionality through programming languages. Development work centers on hardware acceleration, encoding efficiency, and smooth user experience.
| Development areas | Commonly used languages | Technical framework/tools |
|---|---|---|
| Web multimedia | JavaScript / TypeScript | HTML5 Canvas, WebGL, Three.js |
| Mobile Apps/Games | C++ / C# / Swift | Unity, Unreal Engine, Metal |
| Back-end audio and video processing | Python / Go / C++ | FFmpeg, OpenCV, GStreamer |
Note: When developing multimedia programs that involve heavy computation, hardware decoding should be preferred in order to reduce CPU load.
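As a sketch of this note, an FFmpeg transcode command can request hardware decoding first and fall back to software (the file names are placeholders; actually running the command requires ffmpeg on the PATH):

```python
def build_hw_transcode_cmd(src, dst):
    """Build an FFmpeg command line that requests hardware-accelerated
    decoding; '-hwaccel auto' lets FFmpeg pick DXVA2/VA-API/VideoToolbox
    and fall back to software decoding if none is available."""
    return [
        "ffmpeg",
        "-hwaccel", "auto",  # try hardware decoding first
        "-i", src,           # placeholder input file
        "-c:v", "libx264",   # software encode of the output
        dst,                 # placeholder output file
    ]

cmd = build_hw_transcode_cmd("input.mp4", "output.mp4")
# Run with e.g. subprocess.run(cmd) once ffmpeg is on the PATH.
```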
DirectX is a series of application programming interfaces (APIs) developed by Microsoft to let software (especially games) communicate directly with hardware such as graphics cards and sound cards. It is a core pillar of multimedia development on Windows platforms and Xbox consoles.
| Version | Important features | Applicable environment |
|---|---|---|
| DirectX 11 | Introduced tessellation and multi-threaded rendering; highly stable. | Windows 7 and above |
| DirectX 12 | Low-level API that greatly reduces CPU overhead and supports submitting GPU work from multiple CPU cores. | Windows 10 / 11 |
| DirectX 12 Ultimate | Integrates next-generation technologies such as Ray Tracing and Mesh Shaders. | High-end GPUs and Xbox Series X/S |
Note: In modern game development, developers usually call DirectX through engines such as Unity or Unreal Engine instead of directly writing low-level instructions to improve development efficiency.
Media Foundation (MF) is a multimedia framework launched by Microsoft after Windows Vista and is designed to replace the old DirectShow. It adopts a new pipeline design and is optimized for high-resolution video, digital rights management (DRM) and more efficient hardware acceleration. It is the core technology for modern Windows applications to process audio and video.
Media Foundation splits the processing pipeline into three main layers: Media Sources, Media Foundation Transforms (MFTs), and Media Sinks. This design provides very fine-grained control. Compared with the older DirectShow:
| Feature | Media Foundation | DirectShow (legacy) |
|---|---|---|
| High resolution support | Natively optimized for 4K, 8K and HDR content. | The scalability is limited and it is difficult to handle ultra-high resolution. |
| Hardware acceleration | Deeply integrated with DXVA 2.0, extremely efficient. | Depending on specific filter implementation, performance may vary. |
| Content protection | Built-in PMP (Protected Media Path) supports DRM. | There is a lack of unified copyright protection mechanism. |
| Thread model | Use asynchronous topology to reduce UI freezes. | Synchronous execution model can easily lead to interface lag. |
Note: Although Media Foundation has excellent performance, its API design is relatively complex and rigorous. It is recommended that developers use the MFTrace tool provided by Microsoft for debugging to track the event flow in the media pipeline.
DirectShow is a multimedia framework based on the Component Object Model (COM), mainly used for audio and video capture and playback on the Windows platform. Although Microsoft later launched Media Foundation as its successor, DirectShow is still widely used in industrial cameras, medical imaging, and traditional audio and video software due to its strong compatibility and flexibility.
The core concept of DirectShow is the Filter Graph, which processes multimedia data by connecting individual filters into a chain:
| Function | Description |
|---|---|
| media playback | Supports integration of multiple container formats (such as AVI, WMV, MP4) and codecs. |
| Image capture | Provides a standard interface for communicating with WDM (Windows Driver Model) devices, suitable for USB cameras. |
| Hardware acceleration | Rendering can be hardware-accelerated on the graphics card via the Video Mixing Renderer (VMR) or the Enhanced Video Renderer (EVR). |
| format conversion | Supports resampling, cropping, and color space conversion (such as YUV to RGB) of real-time video streams. |
Note: When carrying out modern development, if you do not need to support older systems, Microsoft recommends giving priority to using Media Foundation, which has more advantages in handling high-resolution content and digital rights management (DRM).
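The YUV-to-RGB color-space conversion listed in the table above can be sketched in a few lines of NumPy (full-range BT.601 coefficients, for illustration only; a real DirectShow transform filter performs this in optimized native code):

```python
import numpy as np

def yuv_to_rgb(y, u, v):
    """Convert full-range BT.601 YUV (YCbCr) values to RGB.
    Illustrates the color-space step a transform filter performs."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float) - 128.0  # center the chroma channels
    v = np.asarray(v, dtype=float) - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    # Clamp to the displayable range and pack as 8-bit RGB
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```

For example, a neutral chroma value (U = V = 128) yields a gray pixel whose RGB components all equal Y.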
Vulkan is a next-generation cross-platform graphics and computing API developed by Khronos Group. Unlike OpenGL, Vulkan is a low-level API designed to provide more direct hardware control, minimize the driver's overhead, and improve the utilization of multi-core processors.
Vulkan’s design logic requires developers to assume more management responsibilities in exchange for ultimate performance:
| Feature | Vulkan | OpenGL |
|---|---|---|
| Driver overhead | Very low; most logic is implemented by the developer. | Higher; the driver handles much of the background management. |
| Multi-thread support | Native support for parallel command submission. | Mainly single-threaded. |
| Development complexity | Extremely high; code volume is often several times that of OpenGL. | Moderate; friendlier to beginners. |
| Hardware utilization | High; precise control over GPU compute and memory. | Lower; limited by the API's abstraction level. |
Note: Due to the extremely high development threshold of Vulkan, it is usually recommended for 3D game engine cores that require extreme performance (such as id Tech 7) or scientific simulation programs that require cross-platform high-performance computing.
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library for real-time image processing and analysis.
# Read the image and display it
import cv2

image = cv2.imread("image.jpg")
if image is None:  # in Python, imread returns None on failure
    raise SystemExit("Failed to read image.jpg")
cv2.imshow("Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In OpenCV's C++ API, the core function for reading images is cv::imread, which loads the image file into a cv::Mat matrix.
#include <opencv2/opencv.hpp>
// Function prototype
cv::Mat cv::imread(const std::string& filename, int flags = cv::IMREAD_COLOR);
Commonly used flags:
- cv::IMREAD_COLOR (default): load as a 3-channel BGR color image.
- cv::IMREAD_GRAYSCALE: load as a single-channel grayscale image.
- cv::IMREAD_UNCHANGED: load the image as stored, keeping any alpha channel.
Key point: cv::imread does not throw C++ exceptions on failure, so a traditional try-catch is ineffective. When reading fails (wrong path, unsupported format, or insufficient permissions), it returns an empty cv::Mat object.
The correct handling is therefore to check with the empty() member function:
#include <opencv2/opencv.hpp>
#include <iostream>
int main() {
    std::string path = "data/image.jpg";
    cv::Mat img = cv::imread(path);
    // Always check whether the image loaded successfully
    if (img.empty()) {
        std::cerr << "Error: unable to read the image file!" << std::endl;
        std::cerr << "Please check the path: " << path << std::endl;
        return -1;
    }
    // Continue only after a successful read
    std::cout << "Image width: " << img.cols << " height: " << img.rows << std::endl;
    return 0;
}
If img.empty() is true, it is usually for one of the following reasons:
| Cause | Explanation and countermeasures |
|---|---|
| Wrong file path | The most common cause. Check whether a relative path is resolved against the executable's working directory, or use an absolute path. |
| Unsupported format | OpenCV needs the corresponding decoder (such as libjpeg or libpng); if OpenCV was compiled without that support, the file cannot be read. |
| Non-ASCII (e.g. Chinese) path | On Windows, older versions or specific build configurations of cv::imread handle non-ASCII paths poorly. |
| Insufficient permissions | The user running the program lacks operating-system permission to read the file. |
If reading fails because of a non-ASCII path on Windows, read the file into a memory buffer first and then decode it with cv::imdecode:
#include <fstream>
#include <vector>
cv::Mat imread_unicode(const std::string& path) {
    std::ifstream fs(path, std::ios::binary | std::ios::ate);
    if (!fs.is_open()) return cv::Mat();
    std::streamsize size = fs.tellg();
    fs.seekg(0, std::ios::beg);
    std::vector<char> buffer(size);
    if (fs.read(buffer.data(), size)) {
        return cv::imdecode(cv::Mat(buffer), cv::IMREAD_COLOR);
    }
    return cv::Mat();
}
When the ordering of a point set (for example, points sampled along a screw edge or a sine wave) is scrambled, the points must first be projected onto a fitted line and sorted by that projection; they can then be grouped correctly by their signed distance (positive or negative offset) from the line. Below is an implementation that combines OpenCV and standard C++.
First, implement a function that sorts points by their distance to a specified point; it can be used to locate a starting point or a particular feature point.
#include <vector>
#include <array>
#include <algorithm>
#include <opencv2/opencv.hpp>
using Point2D = std::array<float, 2>;
using Points = std::vector<Point2D>;
namespace GeometryPointsUtil {
    bool FindSortedPointsByDistOfPoint(Points& retPoints, const Points& allPoints, const Point2D& aPoint) {
        if (allPoints.empty()) return false;
        retPoints = allPoints;
        std::sort(retPoints.begin(), retPoints.end(), [&aPoint](const Point2D& p1, const Point2D& p2) {
            float dx1 = p1[0] - aPoint[0];
            float dy1 = p1[1] - aPoint[1];
            float dx2 = p2[0] - aPoint[0];
            float dy2 = p2[1] - aPoint[1];
            // Compare squared distances to avoid the sqrt overhead
            return (dx1 * dx1 + dy1 * dy1) < (dx2 * dx2 + dy2 * dy2);
        });
        return true;
    }
}
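The same helper is easy to mirror in Python for quick experiments (a plain-Python sketch of the C++ function above, likewise comparing squared distances to avoid the sqrt call):

```python
def sort_points_by_distance(points, anchor):
    """Return points sorted by distance to `anchor`, nearest first.
    Mirrors the C++ helper above; compares squared distances so no
    square root is needed."""
    ax, ay = anchor
    return sorted(points, key=lambda p: (p[0] - ax) ** 2 + (p[1] - ay) ** 2)
```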
For an oscillating point set, the following function fits the line, sorts the points by their projection onto it, and splits them into segments according to which side of the line each point lies on.
std::vector<Points> splitOscillatingPoints(const Points& allPoints) {
    if (allPoints.size() < 2) return {allPoints};
    // 1. Fit a straight line
    std::vector<cv::Point2f> cvPts;
    for (const auto& p : allPoints) cvPts.push_back({p[0], p[1]});
    cv::Vec4f line; // (vx, vy, x0, y0)
    cv::fitLine(cvPts, line, cv::DIST_L2, 0, 0.01, 0.01);
    float vx = line[0], vy = line[1], x0 = line[2], y0 = line[3];
    // 2. Projection sort: order the points along the line
    struct ProjectedPoint {
        Point2D original;
        float t;    // projection length along the line
        float side; // signed distance from the line
    };
    std::vector<ProjectedPoint> projected;
    float nx = -vy; // normal vector x
    float ny = vx;  // normal vector y
    for (const auto& p : allPoints) {
        float dx = p[0] - x0;
        float dy = p[1] - y0;
        float t = dx * vx + dy * vy; // displacement projected onto the line
        float s = dx * nx + dy * ny; // perpendicular (signed) distance
        projected.push_back({p, t, s});
    }
    std::sort(projected.begin(), projected.end(), [](const ProjectedPoint& a, const ProjectedPoint& b) {
        return a.t < b.t;
    });
    // 3. Group on sign transitions
    std::vector<Points> segments;
    if (projected.empty()) return segments;
    Points currentGroup;
    bool lastSide = (projected[0].side >= 0);
    for (const auto& item : projected) {
        bool currentSide = (item.side >= 0);
        if (currentSide != lastSide && !currentGroup.empty()) {
            segments.push_back(currentGroup);
            currentGroup.clear();
        }
        currentGroup.push_back(item.original);
        lastSide = currentSide;
    }
    if (!currentGroup.empty()) segments.push_back(currentGroup);
    return segments;
}
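For prototyping, the same fit, project, sort, and split pipeline can be sketched in Python with NumPy, using a least-squares line fit via SVD in place of cv::fitLine (an illustrative port under that substitution, not the original implementation):

```python
import numpy as np

def split_oscillating_points(points):
    """Split points into runs lying on the same side of a fitted line.
    NumPy sketch of the C++/OpenCV version: the line direction comes
    from the first right-singular vector of the centered point cloud
    (equivalent to a total-least-squares fit) instead of cv::fitLine."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 2:
        return [pts.tolist()]
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # Principal direction of the centered cloud
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    d = vt[0]                      # unit direction of the fitted line
    n = np.array([-d[1], d[0]])    # unit normal
    t = centered @ d               # projection along the line (ordering)
    s = centered @ n               # signed distance from the line
    order = np.argsort(t)
    segments, current, last_side = [], [], s[order[0]] >= 0
    for i in order:
        side = s[i] >= 0
        if side != last_side and current:
            segments.append(current)   # side changed: close the run
            current = []
        current.append(pts[i].tolist())
        last_side = side
    if current:
        segments.append(current)
    return segments
```

On a zigzag that alternates sides at every sample, each point becomes its own segment.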
Halcon is a powerful industrial vision software developed by MVTec, specifically designed for image processing and machine vision applications.
Shotcut is a free, open-source video editor that supports many formats and provides a rich set of editing tools.
Applicable platforms: Windows, Mac, Linux
OpenShot is an easy-to-use, open-source video editing tool that is powerful and supports many formats.
Applicable platforms: Windows, Mac, Linux
Blender is a well-known open-source 3D modeling and animation package with a built-in video editor suitable for editing and effects work.
Applicable platforms: Windows, Mac, Linux
Kdenlive is an open-source video editor widely used on Linux that also supports Windows.
Applicable platforms: Windows, Mac, Linux
Lightworks offers free and paid versions; the free version covers basic editing features.
Applicable platforms: Windows, Mac, Linux
The above open source video editing software provides powerful functions that are suitable for different levels of video editing needs, from simple home video editing to professional video production.
| Software name | Approximate search volume |
|---|---|
| OpenShot | 110,000 |
| Kdenlive | 90,500 |
| Shotcut | 49,500 |
| Avidemux | 18,100 |
| Losslesscut | 14,800 |
| Blender VSE | 10,000 |
| Natron | 6,600 |
| Cinelerra | 5,400 |
| Pitivi | 3,600 |
| LiVES | 1,600 |
OpenShot is a free, open-source video editor; the project is named OpenShot/openshot-qt and is developed mainly in Python and Qt. It aims to provide an easy-to-use, feature-rich video editing tool suitable for users of all levels.
OpenShot uses PyQt for its graphical user interface, combined with libopenshot (implemented in C++) for the core video-editing logic. Additionally, OpenShot leverages FFmpeg for decoding and encoding multiple formats.
OpenShot is suitable for users who need simple yet powerful video editing needs. Whether for amateur video creators or for educational purposes, OpenShot provides flexible tools and plug-ins to make editing and creation easy.
The OpenShot project has an active open source community, and users and developers can contribute code, report issues, or submit new feature suggestions through GitHub. Everyone is welcome to participate and help improve the functionality and stability of OpenShot.
Users can download the source code through the GitHub page, or download the executable file from the OpenShot official website. Detailed installation instructions and documentation are also available on GitHub.
import os
import subprocess
def create_kdenlive_project(project_path, video_path, audio_path, srt_path):
    """
    Create a basic Kdenlive XML project file and import the assets
    """
    # Use absolute paths so Kdenlive can resolve the files correctly
    video_abs = os.path.abspath(video_path)
    audio_abs = os.path.abspath(audio_path)
    srt_abs = os.path.abspath(srt_path)
    # Basic Kdenlive MLT structure (simplified)
    kdenlive_xml = f"""<?xml version="1.0" encoding="UTF-8"?>
<mlt version="7.24.0" title="Auto Generated Project">
  <producer id="video_main" resource="{video_abs}"/>
  <producer id="audio_main" resource="{audio_abs}"/>
  <producer id="subtitle_main" resource="{srt_abs}"/>
  <playlist id="main_bin">
    <entry producer="video_main"/>
    <entry producer="audio_main"/>
    <entry producer="subtitle_main"/>
  </playlist>
  <tractor id="main_timeline">
    <multitrack>
      <track name="Video Track">
        <entry producer="video_main" in="0" out="1000"/>
      </track>
      <track name="Audio Track">
        <entry producer="audio_main" in="0" out="1000"/>
      </track>
    </multitrack>
  </tractor>
</mlt>
"""
    with open(project_path, "w", encoding="utf-8") as f:
        f.write(kdenlive_xml)
    print(f"Project file generated: {project_path}")

def open_with_kdenlive(project_path, kdenlive_exe_path):
    """
    Start Kdenlive and load the generated project
    """
    try:
        # Open the program with the project file as an argument
        subprocess.Popen([kdenlive_exe_path, project_path])
        print("Starting Kdenlive...")
    except Exception as e:
        print(f"Startup failed: {e}")

if __name__ == "__main__":
    # File paths
    MY_VIDEO = "input_video.mp4"
    MY_AUDIO = "output_voice.wav"
    MY_SRT = "output_subtitle.srt"
    SAVE_PROJECT = "auto_project.kdenlive"
    # Path to the Kdenlive executable (Windows example; on Linux 'kdenlive' usually suffices)
    KDENLIVE_PATH = r"C:\Program Files\kdenlive\bin\kdenlive.exe"
    # 1. Generate the project file
    create_kdenlive_project(SAVE_PROJECT, MY_VIDEO, MY_AUDIO, MY_SRT)
    # 2. Launch Kdenlive
    open_with_kdenlive(SAVE_PROJECT, KDENLIVE_PATH)
import subprocess
# Define a simple MLT XML structure
# This XML defines the playback order of two pieces of material.
mlt_xml_content = """<mlt>
<producer id="clip1" resource="video_part1.mp4" />
<producer id="clip2" resource="video_part2.mp4" />
<playlist id="main_track">
<entry producer="clip1" in="0" out="150" />
<entry producer="clip2" in="0" out="300" />
</playlist>
</mlt>
"""
# Write the content to a file
with open("auto_edit.mlt", "w", encoding="utf-8") as f:
    f.write(mlt_xml_content)
def render_video(mlt_file, output_file):
    """
    Use the melt command-line tool to render the video directly (no GUI)
    """
    # melt is the command-line interface to the MLT framework
    command = [
        "melt",
        mlt_file,
        "-consumer", f"avformat:{output_file}",
        "acodec=aac", "vcodec=libx264", "preset=fast"
    ]
    try:
        print(f"Starting background render: {output_file}...")
        subprocess.run(command, check=True)
        print("Rendering complete!")
    except FileNotFoundError:
        print("Error: the melt executable was not found; check that the MLT framework is installed.")

if __name__ == "__main__":
    # Perform the render
    render_video("auto_edit.mlt", "final_result.mp4")
This script locates UI elements through image recognition. Before running it, capture small screenshots of the "Image and Text to Video" and "Generate Video" buttons in the Jianying editing interface, save them as btn_start.png and btn_generate.png, and place them in the same directory as the script.
Please install the necessary Python libraries first:
pip install pyautogui pyperclip opencv-python
import os
import time
import pyautogui
import pyperclip
# Set parameters
JIANYING_PATH = r"C:\Users\YourName\AppData\Local\JianyingPro\Apps\JianyingPro.exe" # Please replace it with your actual path
SCRIPT_FILE = "my_script.txt" # Pre-prepared script file
CONFIDENCE_LEVEL = 0.8 # Image recognition accuracy (0-1)
def run_automation():
    # 1. Read the script text
    if not os.path.exists(SCRIPT_FILE):
        print("Error: script file not found")
        return
    with open(SCRIPT_FILE, "r", encoding="utf-8") as f:
        content = f.read()
    # 2. Launch Jianying
    print("Starting Jianying...")
    os.startfile(JIANYING_PATH)
    time.sleep(8)  # Wait for the software to finish loading
    try:
        # 3. Locate and click the "Image and Text to Video" button
        start_btn = pyautogui.locateCenterOnScreen('btn_start.png', confidence=CONFIDENCE_LEVEL)
        if start_btn:
            pyautogui.click(start_btn)
            print("Entered the image-and-text interface")
            time.sleep(2)
        else:
            print('Unable to locate the "Image and Text to Video" button')
            return
        # 4. Paste the script text
        pyperclip.copy(content)  # Copy the script to the clipboard
        pyautogui.click(x=pyautogui.size().width // 2, y=pyautogui.size().height // 2)  # Click the window center to ensure focus
        pyautogui.hotkey('ctrl', 'v')
        print("Script pasted")
        time.sleep(1)
        # 5. Locate and click "Generate Video"
        gen_btn = pyautogui.locateCenterOnScreen('btn_generate.png', confidence=CONFIDENCE_LEVEL)
        if gen_btn:
            pyautogui.click(gen_btn)
            print("Generating project...")
        else:
            print('Unable to locate the "Generate Video" button')
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    run_automation()
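The locateCenterOnScreen calls above come down to template matching; the idea can be illustrated with a naive exhaustive search (a toy NumPy sketch for intuition; PyAutoGUI delegates to OpenCV's matchTemplate when the confidence argument is used):

```python
import numpy as np

def locate_template(screen, template):
    """Return (row, col) of the best match of `template` inside `screen`
    by exhaustive sum-of-squared-differences search. A toy illustration
    of what image-based UI locating does under the hood."""
    sh, sw = screen.shape
    th, tw = template.shape
    best, best_pos = None, None
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            # Compare the candidate window against the template
            ssd = np.sum((screen[r:r+th, c:c+tw] - template) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos
```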
| Item | Description |
|---|---|
| Image capture | When capturing the button templates, crop to just the text or icon at the center of the button; avoid including background color, so matching survives different UI themes. |
| Sleep timing | The most common automation failure is clicking before the software has responded; adjust the time.sleep values to your machine's performance. |
| Fail-safe | PyAutoGUI has a built-in protection mechanism: quickly moving the mouse to the top-left corner of the screen immediately aborts the program. |
Additional caveats:
- Template images (.png) may fail to match after a change in screen resolution or display scaling (DPI); recapture them under the current settings.
- The pygetwindow library can be used to force the Jianying window into the foreground before locating buttons.
Note: UI automation frequently breaks when software updates move interface elements. If long-term stable operation is required, studying "Path 2: Modify the JSON draft" is a more robust approach.
FFmpeg ships with three command-line tools: ffmpeg, ffprobe, and ffplay. Common one-liners:
- Convert containers: ffmpeg -i input.avi output.mp4
- Cut a 5-second clip: ffmpeg -ss 00:00:10 -i input.mp4 -t 5 output.mp4
- Extract the audio track: ffmpeg -i input.mp4 -q:a 0 -map a output.mp3
- Burn in subtitles: ffmpeg -i input.mp4 -vf subtitles=sub.srt output.mp4
In multimedia development, making sure the execution environment has FFmpeg is a basic requirement. With Python's subprocess module and urllib, the environment configuration can be automated.
The program code is mainly divided into two stages: detecting the system path and remote downloading and decompression.
- Use shutil.which() to look for the executable; this is the most reliable cross-platform way to check PATH.
- Use sys.platform or platform.system() to choose the download link (usually .zip for Windows, .tar.xz for Linux).
- Append the extracted bin directory to os.environ["PATH"].

import os
import shutil
import platform
import urllib.request
import zipfile
def ensure_ffmpeg():
    # 1. Check whether ffmpeg is already on the system PATH
    if shutil.which("ffmpeg"):
        print("FFmpeg already exists on the system path.")
        return True
    print("FFmpeg not detected; starting download...")
    # 2. Choose download info by operating system (Windows example)
    if platform.system() == "Windows":
        url = "https://www.gyan.dev/ffmpeg/builds/ffmpeg-release-essentials.zip"
        target_zip = "ffmpeg.zip"
        extract_dir = "ffmpeg_bin"
        # Download the archive
        urllib.request.urlretrieve(url, target_zip)
        # Extract it
        with zipfile.ZipFile(target_zip, 'r') as zip_ref:
            zip_ref.extractall(extract_dir)
        # Locate the extracted bin directory and extend PATH
        # The actual path depends on the archive's internal layout.
        ffmpeg_path = os.path.abspath(os.path.join(extract_dir, "ffmpeg-release-essentials", "bin"))
        os.environ["PATH"] += os.pathsep + ffmpeg_path
        print(f"FFmpeg deployed to: {ffmpeg_path}")
        return True
    else:
        print("This example only automates the download on Windows; please install manually on other systems.")
        return False

# Run the check
ensure_ffmpeg()
| Item | Description |
|---|---|
| Permissions | On Linux or macOS, the downloaded binary may need os.chmod(path, 0o755) to be granted execute permission. |
| Version pinning | Download from a reliable source (such as Gyan.dev or BtbN) and confirm the version is compatible with your code. |
| Network timeouts | FFmpeg archives are large; wrap the download in try-except to handle interruptions, or use the requests library to show a progress bar. |
Cache the downloaded binaries (for example under AppData or the project root directory) to avoid repeated downloads. Note: In a production environment, frequently downloading large binaries hurts the user experience; prompt the user on first launch, or bundle FFmpeg in the installer instead.
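The progress-bar suggestion in the table can be sketched with urllib's reporthook mechanism (the download call itself is shown only as a comment because it needs network access; percent_done is an illustrative helper name, not a library function):

```python
import sys

def percent_done(block_num, block_size, total_size):
    """Percentage transferred so far, clamped to 0-100; returns 0 when
    the server did not report a Content-Length."""
    if total_size <= 0:
        return 0
    return min(100, block_num * block_size * 100 // total_size)

def progress_hook(block_num, block_size, total_size):
    # Signature expected by urllib.request.urlretrieve's reporthook
    pct = percent_done(block_num, block_size, total_size)
    sys.stdout.write(f"\rDownloading: {pct}%")
    sys.stdout.flush()

# Usage (URL placeholder; requires network access):
# urllib.request.urlretrieve(url, "ffmpeg.zip", reporthook=progress_hook)
```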
The most common and stable way to record the screen in Python is to combine PyAutoGUI (to capture frames), OpenCV (to encode and store the video), and NumPy (to process the image data).
First you need to install the necessary packages. Open a terminal and execute the following commands:
pip install opencv-python pyautogui numpy
The following code will capture the full screen image and save it as an output.mp4 file. Press the q key on your keyboard to stop recording.
import cv2
import pyautogui
import numpy as np
# Get screen resolution
SCREEN_SIZE = tuple(pyautogui.size())
# Define video encoding format (FourCC)
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
# Create VideoWriter object (file name, encoding, frame rate, resolution)
out = cv2.VideoWriter("output.mp4", fourcc, 20.0, SCREEN_SIZE)
print("Recording... Press the 'q' key to stop.")
try:
    while True:
        # Capture the screen
        img = pyautogui.screenshot()
        # Convert to a NumPy array
        frame = np.array(img)
        # Convert color from RGB to BGR (OpenCV's standard format)
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        # Write the frame to the video file
        out.write(frame)
        # Show a live preview; cv2.waitKey only receives key presses while
        # an OpenCV window has focus, so the preview is needed for the
        # 'q' hotkey below to work.
        cv2.imshow("Preview", frame)
        # Detect keyboard input
        if cv2.waitKey(1) == ord("q"):
            break
finally:
    # Release resources and close the window
    out.release()
    cv2.destroyAllWindows()
    print("Recording finished; the file has been saved.")
| Symptom | Cause and suggestion |
|---|---|
| Video plays back too fast | The achieved capture FPS is below the value passed to VideoWriter; lower the writer FPS, or switch to a faster capture library such as mss. |
| Colors look wrong | The COLOR_RGB2BGR conversion was forgotten. |
| Stuttering while recording | Capturing a high-resolution screen is CPU-intensive; lower the resolution or record only a specific region. |
Manim (Mathematical Animation Engine) is an animation library written in Python for creating mathematical figures and animations. It can generate high-quality animations that illustrate mathematical concepts, code-execution processes, or anything else that can be represented with images and motion.
Manim animations are generally produced by writing a Python script and rendering it to a video file. Each animation usually contains one or more scenes (Scene), and each scene is composed of objects (Mobject).
from manim import *
class MyFirstScene(Scene):
    def construct(self):
        text = Text("Hello, Manim!")  # Create a text object
        self.play(Write(text))        # Play the write-on animation
Manim can be installed via pip:
pip install manim
OpenGL (Open Graphics Library) is a cross-language, cross-platform application programming interface (API) for rendering 2D and 3D vector graphics. It is maintained by the Khronos Group and is widely used in computer-aided design (CAD), virtual reality, scientific visualization, and video game development.
OpenGL uses a pipeline architecture to convert 3D data into pixels on the screen. Modern OpenGL core mode relies heavily on shaders:
| Feature | Description |
|---|---|
| Cross-platform compatibility | Runs on Windows, Linux, macOS (via translation layer) and mobile devices (OpenGL ES). |
| State machine model | OpenGL operates like a huge state machine. Developers set the state (such as current color, bound texture) and then execute drawing instructions. |
| GLSL language | Use C-like OpenGL Shading Language to write GPU programs, which has powerful computing capabilities. |
| Extension mechanism | Allow hardware manufacturers to introduce new graphics card functions through Extension without updating the API standard. |
Note: Although Vulkan has been regarded as the successor of OpenGL, providing lower-level hardware control, OpenGL is still the first choice for learning graphics program development due to its relatively simple entry barrier and rich documentation.
ManimGL is an efficient variant of Manim for making mathematical animations, focusing on OpenGL acceleration to improve rendering speed.
Install using pip:
pip install manimgl
Or get the latest version from GitHub:
git clone https://github.com/3b1b/manim.git
cd manim
pip install -e .
Render a simple scene using ManimGL:
from manimlib import *
class HelloManim(Scene):
    def construct(self):
        text = Text("Hello, ManimGL!")
        self.play(Write(text))
        self.wait(2)
Run command:
manimgl script.py HelloManim
If you run into installation or runtime problems, try:
- pip install --upgrade pip

Blender is an open-source, all-in-one 3D creation suite covering the complete pipeline from modeling, animation, and rendering to compositing and video editing. Known for its powerful Cycles render engine and flexible Python API, it is a core tool for independent developers and small to mid-sized studios.
Blender's architecture is extremely compact and uses multiple dedicated engines to work together:
| Feature | Description |
|---|---|
| Python API | Almost the entire UI and feature set can be driven from Python scripts, making add-on development easy. |
| Cross-platform | Natively supports Windows, macOS (Apple Silicon) and Linux, and the .blend file format is identical across platforms. |
| Integrated pipeline | Built-in video sequence editor (VSE) and compositor; no need to switch software for post-production. |
For developers who need to batch-process 3D assets or automate modeling, Blender provides a powerful background (headless) mode:
- Run blender -b -P script.py to execute automated tasks without opening the graphical interface.

Note: Blender updates very quickly (roughly one version every three months); when developing scripts, watch for API compatibility changes between versions.
The bpy module is a Python API designed specifically for Blender; it lets users create, modify, and manage 3D objects and animations through code inside Blender.
What is bpy? bpy is short for Blender Python, a set of libraries that expose Blender's core functionality to Python scripts.
bpy's main submodules, each with a specific purpose:
- bpy.data: access all data in Blender (objects, materials, scenes, and so on).
- bpy.ops: operators that perform actions (such as moving, rotating, or scaling objects).
- bpy.context: access the current Blender state (such as the selected objects or the active tool).
- bpy.types: definitions of Blender's data structures (such as Mesh, Camera, Material).
- bpy.utils: helper functions (such as script registration and unregistration).
A simple example of creating a cube with bpy:
import bpy
# Delete existing objects
bpy.ops.object.select_all(action='SELECT')
bpy.ops.object.delete(use_global=False)
# Add a cube
bpy.ops.mesh.primitive_cube_add(size=2, enter_editmode=False, align='WORLD', location=(0, 0, 0))
Why use bpy? bpy lets you automate repetitive tasks and produce complex models, animations, and renders. For professionals such as game designers, architects, and animators, bpy provides powerful tools for streamlining workflows.
For more details on the bpy module, see the official documentation: Blender Python API Documentation
Unity is a powerful game-development engine and platform for creating 2D and 3D games, interactive applications, and virtual-reality (VR) and augmented-reality (AR) experiences. Its approachable interface and rich tooling suit beginners and professional developers alike, enabling rapid creation of high-quality games and interactive applications.
Cocos is the world's leading open source mobile game development framework, including the early pure code-driven Cocos2d-x and the modern full-featured editor Cocos Creator. Known for its lightweight, efficient and cross-platform support, it is the preferred tool for developing 2D and 3D mobile games and mini-games (such as WeChat mini-games and TikTok mini-games).
The Cocos family is mainly divided into two important development stages to meet the needs of different development habits:
| Feature | Description |
|---|---|
| Extremely cross-platform | Supports iOS, Android, Windows, Mac and various web browsers and instant game platforms. |
| High performance renderer | The bottom layer uses the self-developed GFX abstraction layer, which supports multiple graphics backends such as Vulkan, Metal, DirectX and WebGL. |
| Lightweight | The engine core is compact and packaged games start quickly, making it suitable for platforms with limited networks or strict load-time requirements. |
| TypeScript support | Cocos Creator deeply integrates TypeScript, provides complete type checking and syntax prompts, and reduces the difficulty of maintaining large projects. |
Note: Cocos Creator has now evolved to version 3.x, which fully integrates the core technologies of 2D and 3D. Developers can mix and produce 2D UI and 3D scenes in the same project.
Developing a speech synthesis system is usually divided into three stages. First comes front-end processing, which converts raw text into linguistic features (such as word segmentation, phoneme conversion, and prosody prediction); next, the acoustic model maps these features to an acoustic representation (such as a mel spectrogram); finally, the vocoder turns the acoustic representation into an audible waveform.
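The three stages above can be sketched as a pipeline of toy stubs. All function names and "features" here are illustrative placeholders, not a real TTS API; they only show how data flows from text to waveform.

```python
# Sketch of the classic TTS pipeline: front-end -> acoustic model -> vocoder.
# Every function below is an illustrative stub, not a real library call.

def frontend(text: str) -> list[str]:
    """Front-end processing: normalize text and map it to phoneme-like units."""
    return [c for c in text.lower() if c.isalnum()]  # toy "phonemes"

def acoustic_model(phonemes: list[str]) -> list[list[float]]:
    """Map linguistic features to an acoustic representation (e.g. a mel spectrogram)."""
    return [[float(ord(p)) for p in phonemes]]  # toy one-frame "spectrogram"

def vocoder(mel: list[list[float]]) -> list[float]:
    """Turn the acoustic representation into audible waveform samples."""
    return [v / 255.0 for frame in mel for v in frame]  # toy waveform

def synthesize(text: str) -> list[float]:
    return vocoder(acoustic_model(frontend(text)))

waveform = synthesize("Hello TTS")
print(len(waveform))  # number of toy samples produced
```

A real system replaces each stub with a trained model, but the stage boundaries stay the same.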
| Category | Tools/Models | Development features |
|---|---|---|
| Open source framework | Coqui TTS / ESPnet | Modular design supports a large number of pre-trained models and Fine-tuning |
| lightweight engine | MeloTTS / Kokoro | CPU friendly, suitable for edge computing or embedded devices |
| Conversation optimization | ChatTTS | Designed specifically for spoken dialogue, supporting the insertion of laughter, catchphrases and other details |
| Research-grade models | StyleTTS 2 / VITS | GAN-based (Generative Adversarial Network); sound quality extremely close to a real human voice |
To develop a TTS voice with a specific timbre, you need a high-quality dataset (usually 1 to 10 hours of recordings with matching transcripts). Developers commonly use transfer learning, fine-tuning on a large base model, which significantly reduces the data requirement while improving voice similarity and naturalness.
For most application developers, calling a mature cloud API directly is the most efficient solution. For example, the ElevenLabs API provides strong emotional expressiveness; the Microsoft Azure Speech SDK offers the most complete SSML (Speech Synthesis Markup Language) support, letting developers precisely control pauses, stress, and tone through tags. The OpenAI TTS API, with its simple interface and very low inference latency, is popular in real-time interactive applications.
In the early stages of development, it is recommended to prioritize the balance between latency/RTF (real-time factor) and sound quality. For real-time customer service, low-latency streaming is key; for audiobooks, prefer a model with long-text handling and a rich sense of rhythm. Also pay attention to each language's G2P (grapheme-to-phoneme) support, which directly determines pronunciation correctness.
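RTF (real-time factor) is simply the time spent synthesizing divided by the duration of the audio produced; values below 1.0 mean the system runs faster than real time. A minimal sketch (the function name is illustrative):

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = synthesis time / duration of the generated audio.
    RTF < 1.0 means the system is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds

# e.g. 2.5 s of compute for 10 s of speech -> RTF 0.25 (4x faster than real time)
print(real_time_factor(2.5, 10.0))  # 0.25
```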
CosyVoice 2 is an advanced version of Alibaba's open-source speech synthesis (TTS) model. Compared with the first generation, it achieves significant breakthroughs in pronunciation accuracy, fine-grained emotion control, and streaming inference latency. It not only supports high-quality voice cloning, but also introduces instruction-controllable synthesis to make AI speech more "human".
CosyVoice 2 uses "text-speech language model" and "Flow Matching" technology to achieve end-to-end speech generation:
| Function | CosyVoice 2 Description |
|---|---|
| Multi-language support | Supports Chinese, English, Japanese, Korean and multiple dialects (Cantonese, Sichuan, Shanghainese, Tianjin, etc.). |
| Emotion/command control | Voice emotion and speaking speed can be controlled through commands (such as "speak happily", "speak angrily"). |
| 3-second rapid cloning | Zero-shot, high-fidelity voice reproduction from just 3 to 10 seconds of sample audio. |
| Mixed-language synthesis | Supports mixing Chinese, English, and other languages in the same text, with highly consistent timbre. |
Note: When deploying CosyVoice 2 locally, it is recommended to use an NVIDIA graphics card with at least 8GB of VRAM together with the officially recommended vLLM acceleration framework for optimal RTF (real-time factor) performance.
CosyVoice 2 is developed in Python. Since it involves complex audio-processing and deep-learning dependencies, it is strongly recommended to install it in an isolated Conda virtual environment. Official support is currently best on Linux; Windows users should deploy via WSL2 or a community-modified build.
Before starting, please make sure your system has the NVIDIA driver (recommended 8GB or more video memory) and Conda installed.
conda create -n cosyvoice2 python=3.10
conda activate cosyvoice2
Pynini is the core component that handles text normalization and must be installed through conda:
conda install -y -c conda-forge pynini==2.1.5
git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git
cd CosyVoice
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
CosyVoice 2 requires downloading pre-trained model weights. You can automate the download via a Python script:
from modelscope import snapshot_download
# Download 0.5B main model
snapshot_download('iic/CosyVoice2-0.5B', local_dir='pretrained_models/CosyVoice2-0.5B')
# Download text normalization resources
snapshot_download('iic/CosyVoice-ttsfrd', local_dir='pretrained_models/CosyVoice-ttsfrd')
CosyVoice 2 offers a variety of modes to suit different needs, from quick dubbing to professional cloning:
| Usage mode | Instructions | Applicable scenarios |
|---|---|---|
| Start WebUI | Run `python webui.py` and open the visual interface in your browser. | Manual dubbing and quick effect testing. |
| 3-second rapid cloning | Upload 3-10 seconds of reference audio with its transcript to clone the voice. | Personalized voice packs, self-media dubbing. |
| Cross-language/dialect | Input Chinese text and select a Cantonese or Sichuanese timbre for output. | Localized content production. |
| Command control | Add a command tag before the text (e.g. [laughter], [angry]). | Audiobooks, dramatized voice-overs. |
If you want to integrate CosyVoice 2 into your own Python project (such as Kdenlive's automation script):
from cosyvoice.cli.cosyvoice import CosyVoice2
import torchaudio
# Initialize the model
cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B')
# Run inference (using a pre-trained timbre as an example)
output = cosyvoice.inference_sft('Hello, I am an artificial intelligence voice assistant.', 'Chinese female')
# Save the generated audio
torchaudio.save('output.wav', output['tts_speech'], cosyvoice.sample_rate)
Note: If you are installing on Windows and encounter a sox or compilation error, please refer to GitHub Issue #1046, or try the one-click installation package.
import os
import torch
import torchaudio
import re
from cosyvoice.cli.cosyvoice import CosyVoice2
# Initialize the CosyVoice 2 model
# Make sure the path points to the folder containing the core weights and configuration files
cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B')
def segment_text(text, limit=80):
    """
    Split a long article into appropriately sized segments at punctuation marks,
    to avoid interruptions in speech generation or memory overflow.
    """
    # Split on common sentence-ending punctuation in Chinese and English
    pattern = r'([.!?;!\?\n])'
    parts = re.split(pattern, text)
    chunks = []
    current = ""
    for i in range(0, len(parts)-1, 2):
        sentence = parts[i] + parts[i+1]
        if len(current) + len(sentence) <= limit:
            current += sentence
        else:
            if current:
                chunks.append(current.strip())
            current = sentence
    if current:
        chunks.append(current.strip())
    return [c for c in chunks if c]
def run_tts_pipeline(text, spk_id, file_name):
    """
    Run inference over long text and merge the audio at the tensor level
    """
    text_list = segment_text(text)
    combined_tensors = []
    print(f"Processing: the article has been split into {len(text_list)} segments")
    for idx, segment in enumerate(text_list):
        # Call the CosyVoice 2 inference interface
        # (switch to inference_zero_shot to use reference audio)
        result = cosyvoice.inference_sft(segment, spk_id)
        combined_tensors.append(result['tts_speech'])
        print(f"Completed: {idx + 1}/{len(text_list)}")
    if combined_tensors:
        # Use torch.cat for seamless concatenation
        final_audio = torch.cat(combined_tensors, dim=1)
        # Save as wav at the model's native sample rate
        torchaudio.save(file_name, final_audio, cosyvoice.sample_rate)
        print(f"Task successful! File saved to: {file_name}")

if __name__ == "__main__":
    long_content = "Paste the content of your long article here; this code will automatically handle segmentation and merging."
    run_tts_pipeline(long_content, 'Chinese female', 'output_v2.wav')
import torch
import torchaudio
import re
from cosyvoice.cli.cosyvoice import CosyVoice2
# Initialize CosyVoice 2
cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B')
def format_srt_time(seconds):
    """Convert seconds to the SRT time format HH:MM:SS,mmm"""
    milliseconds = int((seconds - int(seconds)) * 1000)
    seconds = int(seconds)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02}:{minutes:02}:{seconds:02},{milliseconds:03}"
def generate_audio_and_srt(full_text, speaker_id, output_wav, output_srt):
    # Split the long article at punctuation marks
    segments = re.split(r'([.!?;!\?\n])', full_text)
    chunks = []
    for i in range(0, len(segments)-1, 2):
        text = (segments[i] + segments[i+1]).strip()
        if text:
            chunks.append(text)
    audio_list = []
    srt_entries = []
    current_time = 0.0
    sample_rate = cosyvoice.sample_rate  # use the model's native sample rate
    print(f"Processing {len(chunks)} text segments...")
    for i, chunk in enumerate(chunks):
        # Run inference to generate the speech tensor
        output = cosyvoice.inference_sft(chunk, speaker_id)
        audio_tensor = output['tts_speech']
        audio_list.append(audio_tensor)
        # Duration of this segment in seconds (tensor length / sample rate)
        duration = audio_tensor.shape[1] / sample_rate
        end_time = current_time + duration
        # Create the SRT entry
        srt_entries.append(
            f"{i+1}\n"
            f"{format_srt_time(current_time)} --> {format_srt_time(end_time)}\n"
            f"{chunk}\n"
        )
        current_time = end_time
        print(f"Segment {i+1} aligned")
    # Merge and save the audio
    combined_audio = torch.cat(audio_list, dim=1)
    torchaudio.save(output_wav, combined_audio, sample_rate)
    # Save the SRT file
    with open(output_srt, 'w', encoding='utf-8') as f:
        f.write("\n".join(srt_entries))
    print(f"Done! Audio: {output_wav}, subtitles: {output_srt}")

if __name__ == "__main__":
    article = "This is a long article example. [laughter] We can accurately calculate the time of each sentence. In this way, it will be automatically aligned when imported into Kdenlive."
    generate_audio_and_srt(article, 'Chinese female', 'output.wav', 'output.srt')
import os
import torch
import torchaudio
import re
from cosyvoice.cli.cosyvoice import CosyVoice2
# Initialize the model
cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B')
def segment_text_with_tags(text, limit=100):
    """
    Split long text while ensuring tags like [laughter] are not cut apart
    """
    # Match sentence-ending punctuation marks and newlines
    pattern = r'([.!?;!\?\n])'
    parts = re.split(pattern, text)
    chunks = []
    current = ""
    for i in range(0, len(parts)-1, 2):
        sentence = parts[i] + parts[i+1]
        if len(current) + len(sentence) <= limit:
            current += sentence
        else:
            if current:
                chunks.append(current.strip())
            current = sentence
    if current:
        chunks.append(current.strip())
    return chunks
def generate_expressive_audio(text, spk_id, output_path):
    """
    Generate long speech containing emotion instructions
    """
    segments = segment_text_with_tags(text)
    audio_data = []
    for idx, seg in enumerate(segments):
        # Use instruct mode for better tag handling;
        # sft mode also supports basic tags, but instruct mode gives finer emotional control
        output = cosyvoice.inference_instruct(seg, spk_id, 'Control tone and emotion')
        audio_data.append(output['tts_speech'])
        print(f"Processed segment {idx+1}/{len(segments)}")
    if audio_data:
        final_wav = torch.cat(audio_data, dim=1)
        torchaudio.save(output_path, final_wav, cosyvoice.sample_rate)
        print(f"Audio with emotion commands saved to: {output_path}")

if __name__ == "__main__":
    # Example: long text with embedded emotion tags
    rich_text = "This is great news! [laughter] I can't believe it. [surprise] But if this messes up, [angry] I will be very angry."
    generate_expressive_audio(rich_text, 'Chinese female', 'expressive_output.wav')
Developing an ASR (Automatic Speech Recognition) system usually follows this core path: first audio preprocessing (such as noise reduction, VAD voice-activity detection, and feature extraction); then model inference, converting acoustic signals into text probabilities; and finally post-processing (such as punctuation recovery and inverse text normalization, ITN) to produce the final text. Modern development has shifted from traditional HMMs to end-to-end neural network architectures, which greatly simplifies development complexity.
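The preprocessing → inference → post-processing path can be sketched with toy stubs. Every function here is an illustrative placeholder, not a real ASR API; the "model" is a stand-in for a neural network.

```python
# Sketch of the ASR path: preprocessing -> model inference -> post-processing.
# All functions are illustrative stubs, not a real library API.

def preprocess(samples: list[float], threshold: float = 0.01) -> list[float]:
    """Toy VAD: drop leading and trailing near-silent samples."""
    active = [abs(s) > threshold for s in samples]
    if not any(active):
        return []
    start = active.index(True)
    end = len(active) - active[::-1].index(True)
    return samples[start:end]

def model_inference(samples: list[float]) -> str:
    """Stub acoustic model: pretend longer audio yields more words."""
    return " ".join(["word"] * max(1, len(samples) // 4))

def postprocess(text: str) -> str:
    """Toy post-processing: capitalize and restore a final period."""
    return text.capitalize() + "."

def transcribe(samples: list[float]) -> str:
    return postprocess(model_inference(preprocess(samples)))

print(transcribe([0.0, 0.0, 0.5, -0.4, 0.3, 0.2, 0.0]))
```

A production system swaps each stub for a real component (e.g. a VAD model, a neural acoustic model, and an ITN module), but the stage boundaries are the same.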
| Category | Tools/Models | Development features (2026) |
|---|---|---|
| base model | OpenAI Whisper (V3) | Industry standard, with strong noise immunity and multi-language support, it is most suitable for transcribing long audio files. |
| Live streaming | NVIDIA Parakeet-TDT | Designed for ultra-low latency, supports streaming, and is suitable for AI voice assistants. |
| Domestic optimization | FunASR / Yating engine | It is deeply optimized for Chinese, Chinese-English mixed and Taiwanese accents, and supports timestamp and speaker recognition. |
| Deployment framework | Faster-Whisper / Sherpa-ONNX | Significantly improves inference speed and reduces memory usage, making it suitable for running on edge devices or local servers. |
When developing an ASR system, focus on monitoring CER (Character Error Rate) to assess accuracy. For real-time applications, RTF (real-time factor) and latency are crucial: speech must be processed much faster than it is spoken. The development focus in 2026 has shifted to long-text memory and context awareness, such as integrating an LLM to correct recognition errors in professional terminology or specific industries.
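CER can be computed as the Levenshtein (edit) distance between the reference transcript and the hypothesis, divided by the reference length. A minimal self-contained sketch:

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance via dynamic programming (one rolling row)."""
    dp = list(range(len(hyp) + 1))
    for i, rc in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, hc in enumerate(hyp, 1):
            cur = dp[j]
            # deletion, insertion, or substitution/match
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (rc != hc))
            prev = cur
    return dp[len(hyp)]

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate = edit distance / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)

print(cer("speech", "spe3ch"))  # one substitution over six characters
```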
Developers pursuing a rapid launch usually call cloud APIs. Deepgram and AssemblyAI are favored in 2026 for their low latency and rich metadata (such as emotion detection and key-point summaries). The Microsoft Azure Speech SDK provides the most complete custom model fine-tuning (Custom Speech) interface, allowing developers to upload domain-specific text data to fix recognition of specialized vocabulary in fields such as medicine and law.
For individual developers, it is recommended to run quick experiments with the Hugging Face Transformers library and PyTorch. If the application involves privacy (such as medical records), use Whisper.cpp or Vosk for fully offline local deployment. To build a large-scale voice service, Triton Inference Server or Docker containerization enables efficient scheduling and scaling of the ASR model.
The HTML5 `<canvas>` element is an area that can be drawn on with JavaScript, allowing 2D and 3D graphics to be rendered on a web page. It is a container on which drawing operations are performed through code, such as drawing lines, shapes, and images, making it suitable for applications that generate graphics in real time, such as games and graphics editors.
The basic syntax of the `<canvas>` element is:
<canvas id="myCanvas" width="500" height="500"></canvas>
To draw content on a `<canvas>` element, you must use the `getContext` method to obtain a drawing context; currently the most common option is "2d". It returns a `CanvasRenderingContext2D` object that provides many drawing methods.
For example, the following JavaScript code gets the 2D drawing context of a `canvas`:
var canvas = document.getElementById("myCanvas");
var ctx = canvas.getContext("2d");
The drawing context obtained with `getContext("2d")` supports basic operations such as drawing lines, drawing rectangles, and filling colors. For example:

- `moveTo(x, y)` and `lineTo(x, y)` define line segments, and `stroke()` draws them.
- `fillRect(x, y, width, height)` draws a filled rectangle; `strokeRect(x, y, width, height)` draws an outlined rectangle.
- `fillStyle` sets the fill color, e.g. `ctx.fillStyle = "blue";`

Sample code:
ctx.fillStyle = "blue";
ctx.fillRect(50, 50, 100, 100); // Draw a blue rectangle
ctx.strokeStyle = "red";
ctx.beginPath();
ctx.moveTo(0, 0);
ctx.lineTo(200, 200);
ctx.stroke(); // Draw a red line
To clear images on the `canvas`, use the `clearRect(x, y, width, height)` method. For example, the code to clear the entire canvas is:
ctx.clearRect(0, 0, canvas.width, canvas.height);
Using `requestAnimationFrame()`, smooth animation effects can be achieved: clear the previous frame's content before each screen update, then redraw. Here is a simple animation example:
let x = 0;    // starting x position of the square
const y = 75; // fixed y position
function draw() {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.fillRect(x, y, 50, 50); // draw a square
  x += 1; // update position
  requestAnimationFrame(draw);
}
draw();
The size of the canvas should be set via HTML attributes; resizing with CSS may distort the image. Also, `canvas` is not intended to replace high-resolution images, but to be used for real-time generation and dynamic drawing.
`style.transform` is a CSS property that applies 2D or 3D transformations to elements, such as rotation, scaling, translation, and skewing.
`scale()` is the scaling function; its syntax is:
transform: scale(sx [, sy]);
where:

- `sx`: horizontal scaling factor
- `sy`: vertical scaling factor (optional; when omitted it equals `sx`)

const el = document.getElementById("target");
el.style.transform = "scale(1.5)"; // Both x and y are enlarged by 1.5 times
el.style.transform = "scale(1.5, 0.5)"; // Zoom in 1.5 times horizontally and reduce it to half vertically
---
`scale()` is a "visual transformation": it does not change the element's actual DOM metrics (e.g. `offsetWidth` or `clientWidth`), but it does change the values returned by `getBoundingClientRect()`.
el.getBoundingClientRect().width // reflects the effect of scale
el.offsetWidth // original width, not affected by scale
---
<style>
#box {
width: 100px;
height: 100px;
background: skyblue;
transition: transform 0.3s;
}
#box:hover {
transform: scale(1.5);
}
</style>
<div id="box"></div>
---
The following example reads data from an HTML `<table>` and uses the native `<canvas>` API's `arc()` method to draw a pie chart, without any external packages.
<table id="dataTable" border="1" style="margin:10px auto;">
<tr><th>Category</th><th>Value</th></tr>
<tr><td>Apple</td><td>30</td></tr>
<tr><td>Banana</td><td>15</td></tr>
<tr><td>Cherry</td><td>25</td></tr>
<tr><td>Mango</td><td>20</td></tr>
</table>
<canvas id="pieCanvas" width="400" height="400" style="display:block; margin:auto; border:1px solid #aaa;"></canvas>
---
const table = document.getElementById("dataTable");
const canvas = document.getElementById("pieCanvas");
const ctx = canvas.getContext("2d");
const labels = [];
const values = [];
for (let i = 1; i < table.rows.length; i++) { // skip the header row
  const row = table.rows[i];
  labels.push(row.cells[0].textContent);
  values.push(parseFloat(row.cells[1].textContent));
}
// compute the total
const total = values.reduce((a, b) => a + b, 0);
// draw the pie chart
let startAngle = 0;
const centerX = canvas.width / 2;
const centerY = canvas.height / 2;
const radius = 120;
// Automatic color matching
const colors = ["#FF6384", "#36A2EB", "#FFCE56", "#4BC0C0", "#9966FF", "#FF9F40"];
for (let i = 0; i < values.length; i++) {
  const sliceAngle = (values[i] / total) * 2 * Math.PI;
  const endAngle = startAngle + sliceAngle;
  // draw the pie slice
  ctx.beginPath();
  ctx.moveTo(centerX, centerY);
  ctx.arc(centerX, centerY, radius, startAngle, endAngle);
  ctx.closePath();
  ctx.fillStyle = colors[i % colors.length];
  ctx.fill();
  // draw the label text
  const midAngle = startAngle + sliceAngle / 2;
  const textX = centerX + Math.cos(midAngle) * (radius + 20);
  const textY = centerY + Math.sin(midAngle) * (radius + 20);
  ctx.fillStyle = "black";
  ctx.font = "14px sans-serif";
  ctx.textAlign = "center";
  ctx.fillText(labels[i], textX, textY);
  startAngle = endAngle;
}
// chart title
ctx.font = "bold 16px sans-serif";
ctx.textAlign = "center";
ctx.fillText("Fruit Sales Share", centerX, centerY - radius - 30);
---
Each sector is drawn with `arc()`. You can extend this example with mouse events (such as hover to enlarge a slice or show its percentage), or add animation effects with `requestAnimationFrame()`.
SVG (Scalable Vector Graphics) is an XML-based vector graphics format that can draw lines, shapes, and text on web pages, and supports scaling and animation. Unlike bitmaps, SVG is not distorted when zoomed in or out, making it suitable for applications such as charts, icons, maps, and flowcharts.
<svg width="200" height="100">
<rect x="10" y="10" width="50" height="50" fill="blue" />
<circle cx="100" cy="35" r="25" fill="green" />
<line x1="150" y1="10" x2="190" y2="60" stroke="red" stroke-width="2" />
<text x="10" y="90" font-size="14" fill="black">This is SVG</text>
</svg>
Common SVG elements:

- `<rect>`: rectangle
- `<circle>`: circle
- `<ellipse>`: ellipse
- `<line>`: line segment
- `<polyline>`, `<polygon>`: polyline and polygon
- `<path>`: free drawing path (curves and complex shapes)
- `<text>`: text

<svg width="100" height="100">
<circle cx="50" cy="50" r="40" fill="orange" onclick="alert('You clicked on the circle')" />
</svg>
Animations can be added through CSS or the `<animate>` tag:
<circle cx="30" cy="50" r="20" fill="blue">
<animate attributeName="cx" from="30" to="170" dur="2s" repeatCount="indefinite" />
</circle>
<svg id="mysvg" width="200" height="100">
<circle id="c1" cx="50" cy="50" r="30" fill="gray" />
</svg>
<script>
document.getElementById("c1").setAttribute("fill", "red");
</script>
SVG is one of the most important graphics standards in web front-end development. It offers resolution independence, interactivity, and animation, and integrates seamlessly with HTML/CSS/JavaScript. It is suitable for graphics that require precise, scalable rendering.
In SVG, you can define a pattern once with `<symbol>` or `<defs>` and then reference it repeatedly elsewhere with `<use>`, saving code and improving consistency.
<svg width="0" height="0" style="position:absolute">
<symbol id="star" viewBox="0 0 100 100">
<polygon points="50,5 61,39 98,39 68,59 79,91 50,70 21,91 32,59 2,39 39,39"
fill="gold" stroke="black" stroke-width="2"/>
</symbol>
</svg>
<svg width="200" height="100">
<use href="#star" x="0" y="0" width="50" height="50"/>
<use href="#star" x="60" y="0" width="50" height="50" fill="red"/>
<use href="#star" x="120" y="0" width="50" height="50" fill="blue"/>
</svg>
- `<symbol>`: defines reusable pattern content
- `<use>`: inserts the pattern; position and size can be specified
- `href`: points to the symbol's id (the legacy form is `xlink:href`)

`<use>` can override attributes such as `fill` and `stroke`, replacing the original definition.

- `<use href="#id">` is the modern form (older browsers use `xlink:href`)
- Place `<symbol>` at the top of the DOM or hide it absolutely

Through `<symbol>` + `<use>`, SVG enables componentized, modular graphics development: patterns can be reused, and styles and positions are easy to manage. This is well suited to graphic design and data visualization.
WebGL (Web Graphics Library) is a set of JavaScript APIs based on OpenGL ES that performs hardware-accelerated drawing of 2D and 3D graphics in the browser's HTML5 `<canvas>` element, without any plug-ins.
Draw a colored triangle:
<canvas id="glCanvas" width="300" height="300"></canvas>
<script>
const canvas = document.getElementById('glCanvas');
const gl = canvas.getContext('webgl');
if (!gl) {
alert("Your browser does not support WebGL");
}
const vertexShaderSource = `
attribute vec2 a_position;
void main() {
gl_Position = vec4(a_position, 0, 1);
}
`;
const fragmentShaderSource = `
void main() {
gl_FragColor = vec4(1, 0, 0, 1); // red
}
`;
function createShader(gl, type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    console.error(gl.getShaderInfoLog(shader)); // surface compile errors
  }
  return shader;
}
const vertexShader = createShader(gl, gl.VERTEX_SHADER, vertexShaderSource);
const fragmentShader = createShader(gl, gl.FRAGMENT_SHADER, fragmentShaderSource);
const program = gl.createProgram();
gl.attachShader(program, vertexShader);
gl.attachShader(program, fragmentShader);
gl.linkProgram(program);
gl.useProgram(program);
const positionBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, positionBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([
0, 1,
-1, -1,
1, -1
]), gl.STATIC_DRAW);
const posAttribLoc = gl.getAttribLocation(program, "a_position");
gl.enableVertexAttribArray(posAttribLoc);
gl.vertexAttribPointer(posAttribLoc, 2, gl.FLOAT, false, 0, 0);
gl.clearColor(0, 0, 0, 1);
gl.clear(gl.COLOR_BUFFER_BIT);
gl.drawArrays(gl.TRIANGLES, 0, 3);
</script>
WebGL provides web developers with GPU-accelerated 3D graphics rendering and is one of the core technologies for modern web games, digital art, simulation, and visualization. Although native WebGL is relatively low-level, it can be paired with higher-level libraries such as Three.js to simplify development.
A Spirograph is a geometric pattern technique used to create complex shapes: two circles roll against each other, tracing looping, wavy curves. This type of figure is often used in art and education to show the geometric beauty of mathematics.
A Spirograph can be implemented with HTML5's `<canvas>` element and JavaScript by plotting the curve's parametric (hypotrochoid) equations.
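The rolling-circle geometry reduces to the hypotrochoid equations. As a minimal, language-agnostic sketch of the point computation (shown here in Python for brevity; in a browser the same points would be fed to `ctx.lineTo`), with `R` the fixed-circle radius, `r` the rolling-circle radius, and `d` the pen offset — the function name and parameter values are illustrative:

```python
import math

def spirograph_points(R: float, r: float, d: float, steps: int = 2000) -> list[tuple[float, float]]:
    """Hypotrochoid: a circle of radius r rolls inside a circle of radius R,
    with the pen at distance d from the rolling circle's center."""
    pts = []
    k = (R - r) / r
    for i in range(steps + 1):
        t = 2 * math.pi * i * 8 / steps  # several revolutions so the curve closes
        x = (R - r) * math.cos(t) + d * math.cos(k * t)
        y = (R - r) * math.sin(t) - d * math.sin(k * t)
        pts.append((x, y))
    return pts

pts = spirograph_points(100, 35, 50)
print(len(pts))  # 2001 points, ready to be drawn as a polyline
```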
| Library | Syntax expressiveness | Graphic types | Target users | Interaction support | Animation support |
|---|---|---|---|---|---|
| Mermaid.js | Extremely high (using Markdown-like syntax) | Flow chart, sequence chart, Gantt chart, ER chart, Class chart | Document visualization, rapid prototyping | Limited support | Partial support |
| D3.js | Medium (needs to understand data binding and DOM operations) | Almost any graphics (extremely customizable) | Advanced data visualization developer | Full support | Full support |
| Cytoscape.js | High (nodes and edges defined in JSON) | Network diagram, flow chart | Bioinformatics, social network analysis | Full support | Partial support |
| Vega / Vega-Lite | High (use JSON declarative description of chart) | Statistical charts (bar charts, scatter charts, etc.) | Data Science, Dashboard Design | support | Partial support |
| Graphviz via Viz.js | High (DOT syntax is similar to text programming) | Flow chart, graph theory structure | Academic use, quick architecture diagram | Not supported | Not supported |
| JSXGraph | High (geometric semantics are clear) | Geometric figures, coordinate diagrams | mathematics education | support | support |
Chart.js is an open-source, lightweight, and powerful JavaScript charting library.
It draws various interactive charts on an HTML5 `<canvas>` element.
Known for its simple API, attractive default styles, and highly customizable options,
it is suitable for quickly visualizing data on websites or in applications.
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
npm install chart.js
<canvas id="myChart"></canvas>
<script>
const ctx = document.getElementById('myChart').getContext('2d');
new Chart(ctx, {
type: 'bar',
data: {
labels: ['red', 'blue', 'yellow', 'green', 'purple', 'orange'],
datasets: [{
label: 'votes',
data: [12, 19, 3, 5, 2, 3],
backgroundColor: [
'rgba(255, 99, 132, 0.6)',
'rgba(54, 162, 235, 0.6)',
'rgba(255, 206, 86, 0.6)',
'rgba(75, 192, 192, 0.6)',
'rgba(153, 102, 255, 0.6)',
'rgba(255, 159, 64, 0.6)'
],
borderWidth: 1
}]
},
options: {
responsive: true,
scales: {
y: { beginAtZero: true }
}
}
});
</script>
---
| Chart type | `type` setting | Description |
|---|---|---|
| Line chart | line | Display time series or trend data. |
| bar chart | bar | Compare values from different categories. |
| pie chart | pie | Shows overall proportional distribution. |
| donut chart | doughnut | A variation of the pie chart, the center can be left blank to display the title. |
| radar chart | radar | Comparison of multidimensional data. |
| polar area chart | polarArea | Combines the effects of pie and bar charts. |
You can check the version of Chart.js using:
console.log(Chart.version);
---
The following example demonstrates how to read data from an HTML `<table>` and dynamically draw a pie chart with JavaScript. It uses Chart.js, which is easy to use and supports automatic colors and animation.
<!-- Table data -->
<table id="dataTable" border="1" style="margin:10px auto;">
<tr><th>Category</th><th>Value</th></tr>
<tr><td>Apple</td><td>30</td></tr>
<tr><td>Banana</td><td>15</td></tr>
<tr><td>Cherry</td><td>25</td></tr>
<tr><td>Mango</td><td>20</td></tr>
</table>
<!-- Pie chart container -->
<canvas id="pieChart" width="400" height="400"></canvas>
<!-- Load Chart.js -->
<script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
---
// read the table data
const table = document.getElementById("dataTable");
const labels = [];
const values = [];
for (let i = 1; i < table.rows.length; i++) { // skip the header row
  const row = table.rows[i];
  labels.push(row.cells[0].textContent);
  values.push(parseFloat(row.cells[1].textContent));
}
// create the Chart.js pie chart
const ctx = document.getElementById("pieChart").getContext("2d");
new Chart(ctx, {
type: "pie",
data: {
labels: labels,
datasets: [{
data: values,
backgroundColor: [
"rgba(255, 99, 132, 0.7)",
"rgba(54, 162, 235, 0.7)",
"rgba(255, 206, 86, 0.7)",
"rgba(75, 192, 192, 0.7)"
],
borderColor: "white",
borderWidth: 2
}]
},
options: {
responsive: true,
plugins: {
legend: { position: "bottom" },
title: { display: true, text: "Fruit Sales Share" }
}
}
});
---
- Data is retrieved dynamically from the `<table>`, with no manual definition needed.
- Switching `type: "pie"` to `"doughnut"` produces a donut chart.
- To draw in pure JavaScript (without an external library), you can render the sectors yourself with `CanvasRenderingContext2D.arc()`, as in the earlier pie chart example.
In HTML you can use `<svg>` tags to draw basic UML class diagrams. Here's an example using rectangles and text to represent a simple class.
<svg width="300" height="200">
<rect x="50" y="20" width="200" height="30" fill="lightblue" stroke="black"/>
<text x="60" y="40" font-family="Arial" font-size="16">Class Name</text>
<rect x="50" y="50" width="200" height="50" fill="white" stroke="black"/>
<text x="60" y="70" font-family="Arial" font-size="14">+ attribute1 : Type</text>
<text x="60" y="90" font-family="Arial" font-size="14">+ attribute2 : Type</text>
<rect x="50" y="100" width="200" height="50" fill="white" stroke="black"/>
<text x="60" y="120" font-family="Arial" font-size="14">+ method1() : ReturnType</text>
<text x="60" y="140" font-family="Arial" font-size="14">+ method2() : ReturnType</text>
</svg>
Different UML elements can be defined using HTML and CSS styles. The following example shows how to use `<div>` and CSS to draw a class box styled to mimic the structure of a UML class diagram.
<style>
.class-box {
width: 200px;
border: 1px solid black;
margin: 10px;
}
.header {
background-color: lightblue;
text-align: center;
font-weight: bold;
}
.attributes, .methods {
padding: 10px;
border-top: 1px solid black;
}
</style>
<div class="class-box">
<div class="header">ClassName</div>
<div class="attributes">
+ attribute1 : Type <br>
+ attribute2 : Type
</div>
<div class="methods">
+ method1() : ReturnType <br>
+ method2() : ReturnType
</div>
</div>
To draw more complex UML diagrams in HTML, you can use an external JavaScript library such as mermaid.js. It supports a variety of UML diagrams and can be embedded directly in HTML. First reference mermaid.js, then write the UML diagram definition inside a `<pre>` tag.
<script type="module">
import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs';
mermaid.initialize({ startOnLoad: true });
</script>
<pre class="mermaid">
classDiagram
Class01 <|-- Class02 : Inheritance
Class01 : +method1() void
Class02 : +method2() void
Class03 : +attribute int
Class04 : +method() void
</pre>
Examples like this make it easy to draw more complex, clearer UML diagrams with mermaid.js, which supports several diagram types. The next example shows inheritance, aggregation, and association between classes.
<pre class="mermaid">
classDiagram
Animal <|-- Mammal
Animal <|-- Bird
Mammal o-- Dog : has-a
Bird --> Wing : has-a
class Animal {
+String name
+int age
+eat() void
}
class Mammal {
+hasFur() bool
}
class Dog {
+bark() void
}
class Bird {
+fly() void
}
class Wing {
+wingSpan int
}
</pre>
Explanation: this example shows several relationships:
- Animal is the superclass of Mammal and Bird.
- The aggregation between Mammal and Dog is expressed with `o--`.
- The association between Bird and Wing is expressed with `-->`.

The next example shows how to represent multiplicity (1..*, 0..1, etc.) and roles between classes.
<pre class="mermaid">
classDiagram
Customer "1" --> "0..*" Order : places
Order "1" --> "1" Payment : includes
class Customer {
+String name
+String email
+placeOrder() void
}
class Order {
+int orderId
+String date
+calculateTotal() float
}
class Payment {
+float amount
+String method
+processPayment() void
}
</pre>
Explanation:
- A Customer can place multiple Orders, and each Order corresponds to exactly one Payment.
- The role labels (`places` and `includes`) annotate each relationship.

The next example shows how to define interfaces and abstract classes in Mermaid.js.
<pre class="mermaid">
classDiagram
class Shape {
<<abstract>>
+area() float
+perimeter() float
}
Shape <|-- Rectangle
Shape <|-- Circle
class Rectangle {
+width float
+height float
+area() float
+perimeter() float
}
class Circle {
+radius float
+area() float
+perimeter() float
}
</pre>
Explanation:
- Shape is an abstract class, marked with `<<abstract>>`.
- Rectangle and Circle inherit from Shape and implement its methods.

The next example shows a mix of class inheritance and interface implementation.
<pre class="mermaid">
classDiagram
class Flyable {
<<interface>>
+fly() void
}
class Bird {
+String species
+String color
+sing() void
}
class Airplane {
+String model
+int capacity
+takeOff() void
}
Bird ..|> Flyable : implements
Airplane ..|> Flyable : implements
</pre>
Explanation:
- Flyable is an interface that defines the `fly()` method.
- Bird and Airplane both implement the Flyable interface, expressed with the `..|>` symbol.
Mermaid offers the official Mermaid Live Editor, where you can test diagrams and check for syntax errors on the fly. After pasting Mermaid syntax, the editor displays specific error messages for any mistakes, letting you troubleshoot faster.
If your Mermaid chart is too complex, segmented testing is recommended: remove some classes or relationships first, leaving only the most basic structure, then gradually add elements back to locate the source of syntax errors more quickly.
Different versions of Mermaid.js may have different support for the syntax. Make sure you are using the latest version, or verify in a test environment that your version of Mermaid.js supports the syntax features used.
Other debugging tips:
- Use `+attribute : type` to mark attributes.
- Confirm that relationship symbols (such as `<|--` or `-->`) are correct; these symbols represent class relationships.
- If the diagram uses `<<abstract>>` or other special markers, remove or modify them one by one to confirm whether they cause the error.
- Check the JavaScript console in the browser's developer tools. If the Mermaid chart is not generated correctly, specific error messages there can help you identify syntax errors.
The official Mermaid documentation provides detailed syntax guidelines to help you confirm correct usage; it is available on the Mermaid.js official website.
Below is a simple flowchart example illustrating the logical relationship between decisions and actions.
flowchart TD
A[Start] --> B{Do you need to continue? }
B -- Yes --> C[Perform operation]
B -- No --> D[End]
C --> D
Paste the flowchart syntax above into a Mermaid-enabled tool, such as the Markdown editor or the Mermaid online tool, to generate the graph.
This JavaScript helper adds a zoom slider to Mermaid.js charts, letting users control the chart's scale with an `<input type="range">` element. It uses CSS `transform: scale()` for visual scaling, so the Mermaid diagram is not re-rendered.
// mermaidZoomSlider.js
export function setupMermaidZoomSlider({
sliderId = "zoomSlider",
diagramContainerId = "mermaidContainer",
min = 0.1,
max = 3,
step = 0.1,
initial=1
} = {}) {
window.addEventListener("load", () => {
const slider = document.getElementById(sliderId);
const container = document.getElementById(diagramContainerId);
if (!slider || !container) {
console.warn("Mermaid zoom slider: Missing slider or container element");
return;
}
//Initialize slider properties
slider.min = min;
slider.max = max;
slider.step = step;
slider.value = initial;
//Set initial zoom
container.style.transformOrigin = "top left";
container.style.transform = `scale(${initial})`;
//Event listening: zoom
slider.addEventListener("input", () => {
const scale = parseFloat(slider.value);
container.style.transform = `scale(${scale})`;
});
});
}
<!-- HTML -->
<div>
<input type="range" id="zoomSlider">
</div>
<div id="mermaidContainer">
<pre class="mermaid">
graph TD;
A-->B;
B-->C;
</pre>
</div>
<!-- JavaScript module introduction -->
<script type="module">
import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs";
import { setupMermaidZoomSlider } from "./mermaidZoomSlider.js";
mermaid.initialize({ startOnLoad: true });
setupMermaidZoomSlider({
sliderId: "zoomSlider",
diagramContainerId: "mermaidContainer",
min: 0.2,
max: 3,
step: 0.1,
initial: 1
});
</script>
If you need advanced functions such as drag-to-pan or zoom reset, you can extend this helper further, for example by integrating mouse dragging and a zoom-reset button.
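One such extension can be sketched as a pure function (the `buildTransform` helper is illustrative, not part of the library above): combine pan offsets with the clamped zoom scale into a single CSS transform string, then assign it to `container.style.transform`.

```javascript
// Combine a zoom scale with drag-to-pan offsets into one CSS transform.
// The scale is clamped to the slider's [min, max] range.
function buildTransform(scale, dx, dy, min = 0.1, max = 3) {
  const clamped = Math.min(max, Math.max(min, scale));
  return `translate(${dx}px, ${dy}px) scale(${clamped})`;
}

// e.g. after a mousemove that dragged the diagram 40px right and 10px down:
const t = buildTransform(1.5, 40, 10); // "translate(40px, 10px) scale(1.5)"
```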
In Mermaid.js diagrams, markers such as `-->` and `==>` establish connections between nodes; different symbols produce different line styles.
| Syntax | Rendered as | Description |
|---|---|---|
| `-->` | ──> | General solid arrow |
| `--->` | ───> | Identical to `-->` (the syntax tolerates extra dashes) |
| `-- text -->` | ── text ──> | Solid arrow with text label |
| `-.->` | -.-> | Dashed arrow |
| `-. text .->` | -. text .-> | Dashed arrow with text |
| `==>` | ===> | Thick solid arrow |
| `== text ==>` | == text ==> | Thick arrow with text |
| `--o` | ──○ | Round endpoint without arrowhead (common in class diagrams) |
| `--\|>` | ──▷ | Triangle arrowhead (common in class diagrams) |
| `-->\|label\|` | ──> (with label text) | Flowchart arrow with a label annotation |
graph TD
A[Start] --> B[Step 1]
B -.-> C[Asynchronous processing]
C ==> D[Strong dependency]
D -- text --> E[Connection with text]
E --o F[Round endpoint]
F --|> G[Triangle arrowhead]
Note that line syntax varies slightly between flowcharts (`graph`), class diagrams (`classDiagram`), state diagrams, and other diagram types. Mermaid.js provides a variety of line styles, letting users clearly express processes, logic, and relationships; by combining solid, dashed, and thick lines with different endpoints, you can create simple, well-structured diagrams.
D3.js (Data-Driven Documents) is an open-source JavaScript library for transforming data into dynamic, interactive visualizations. It uses web-standard technologies such as SVG, HTML, and CSS, and provides powerful tools for manipulating data and drawing graphics.
Its core selection API consists of `d3.select()` and `d3.selectAll()`. D3.js is widely used in a variety of data visualization scenarios.
D3.js is a powerful and flexible data visualization tool for developers who require highly customized charts and interactive effects. Although the learning curve is slightly higher, once mastered, its application potential is endless.
This example uses D3.js to draw a simple tree diagram, showing how to visualize hierarchical data. The main steps: define the hierarchical data, build the SVG container, and use the `d3.tree()` layout function to generate the tree layout.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>D3.js Tree Diagram Example</title>
<script src="https://d3js.org/d3.v7.min.js"></script>
<style>
.node circle {
fill: steelblue;
}
.node text {
font: 12px sans-serif;
}
.link {
fill: none;
stroke: #ccc;
stroke-width: 1.5px;
}
</style>
</head>
<body>
<script>
const width = 800;
const height = 600;
const treeData = {
name: "CEO",
children: [
{
name: "CTO",
children: [
{ name: "Engineering Manager" },
{ name: "Product Manager" }
]
},
{
name: "CFO",
children: [
{ name: "Accountant" },
{ name: "Finance Analyst" }
]
}
]
};
const svg = d3.select("body")
.append("svg")
.attr("width", width)
.attr("height", height)
.append("g")
.attr("transform", "translate(40,40)");
const treeLayout = d3.tree().size([height - 100, width - 160]);
const root = d3.hierarchy(treeData);
treeLayout(root);
svg.selectAll(".link")
.data(root.links())
.enter()
.append("path")
.attr("class", "link")
.attr("d", d3.linkHorizontal()
.x(d => d.y)
.y(d => d.x)
);
const nodes = svg.selectAll(".node")
.data(root.descendants())
.enter()
.append("g")
.attr("class", "node")
.attr("transform", d => `translate(${d.y},${d.x})`);
nodes.append("circle").attr("r", 5);
nodes.append("text")
.attr("dy", 3)
.attr("x", d => d.children ? -10 : 10)
.style("text-anchor", d => d.children ? "end" : "start")
.text(d => d.data.name);
</script>
</body>
</html>
After running this code, you will see a tree diagram: the root node (CEO) sits on the left, and the child nodes (CTO and CFO) expand to the right. This example can be extended to deeper hierarchies, or its styles adjusted to suit different needs.
A rectangular treemap is a visualization technique that uses nested rectangles to display hierarchical data. The area of each rectangle represents a numeric value, such as sales or file size, and each rectangle can be nested further to represent subcategories.
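The idea behind such a layout can be sketched with a simple slice-and-dice treemap (not D3's squarified algorithm; the `sliceDice` helper and its data are illustrative): each node's rectangle is split among its children in proportion to their values, alternating split direction by depth.

```javascript
// Slice-and-dice treemap: split a rectangle among children in proportion
// to their values, alternating horizontal/vertical splits by depth.
function sliceDice(node, x0, y0, x1, y1, depth = 0) {
  node.rect = { x0, y0, x1, y1 };
  if (!node.children) return node;
  const total = node.children.reduce((s, c) => s + c.value, 0);
  let offset = 0;
  for (const child of node.children) {
    const frac = child.value / total;
    if (depth % 2 === 0) {
      // split horizontally
      const cx0 = x0 + (x1 - x0) * offset;
      const cx1 = x0 + (x1 - x0) * (offset + frac);
      sliceDice(child, cx0, y0, cx1, y1, depth + 1);
    } else {
      // split vertically
      const cy0 = y0 + (y1 - y0) * offset;
      const cy1 = y0 + (y1 - y0) * (offset + frac);
      sliceDice(child, x0, cy0, x1, cy1, depth + 1);
    }
    offset += frac;
  }
  return node;
}

// Two leaves with a 3:1 value ratio split a 400x300 canvas 300px / 100px wide.
const tree = sliceDice({ children: [{ value: 3 }, { value: 1 }] }, 0, 0, 400, 300);
```

Each `node.rect` can then be drawn as an SVG `<rect>` or on a canvas.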
Cytoscape.js is a JavaScript library used to draw network graphs (Graph). It uses JSON to define nodes and edges. It has simple syntax and supports interaction and style customization.
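A minimal sketch of such a JSON definition (illustrative values; in a browser you would pass this object, plus a `container` element, to `cytoscape(...)` after loading the library):

```javascript
// Minimal Cytoscape.js-style definition: two nodes joined by one edge,
// plus style and layout sections. Rendering requires the cytoscape
// library in a browser; here we only build the configuration object.
const cyConfig = {
  elements: [
    { data: { id: "a", label: "Node A" } },
    { data: { id: "b", label: "Node B" } },
    { data: { id: "ab", source: "a", target: "b" } }
  ],
  style: [
    { selector: "node", style: { label: "data(label)", "background-color": "steelblue" } },
    { selector: "edge", style: { "line-color": "#ccc", "target-arrow-shape": "triangle" } }
  ],
  layout: { name: "grid" }
};
```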
Key concepts in Cytoscape.js:
- `elements`: defines node and edge data
- `style`: controls colors, arrows, labels, and other styles
- `layout`: graph arrangement (grid, circle, cose, etc.)
- `classes`: defines node groups and assigns them different styles

The following table compares libraries for drawing circuit diagrams and flowcharts:

| Library | Suitability | Features | Interactive | Notes |
|---|---|---|---|---|
| JointJS | ★★★★★ | High drawing freedom and scalable circuit component symbols | ✔️ | It can draw logic circuits and flow charts. The free version has enough functions. |
| GoJS | ★★★★☆ | Powerful graphics and data model support | ✔️ | Not free software, but there is a free trial; often used in production line diagrams and circuit diagrams |
| SVG.js | ★★★☆☆ | Lightweight and supports precise drawing | ✔️ | Requires self-designed components (resistors, capacitors, etc.), suitable for detailed control |
| Konva.js | ★★★☆☆ | Both Canvas and SVG are supported | ✔️ | Design tools suitable for interactive behaviors such as dragging and clicking |
| ELK.js | ★★☆☆☆ | Excellent automatic layout | ✖️ | Only responsible for layout algorithm (can be paired with JointJS) |
This tool treats x and y as independent variables, plots the equation z = f(x, y) as a 3D surface, and provides interactive mouse controls for rotation, zooming, and panning.
<div id="plot3d" style="width:100%; height:600px;"></div>
<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
<script type="module">
// Define the function z = f(x, y) (can be replaced by any equation)
function computeZ(x, y) {
return Math.sin(x) * Math.cos(y); // z = sin(x) * cos(y)
}
// Generate evenly spaced sample points (plain helper, so no numeric.js is needed)
function linspace(a, b, n) {
const step = (b - a) / (n - 1);
return Array.from({ length: n }, (_, i) => a + i * step);
}
const xRange = linspace(-5, 5, 50);
const yRange = linspace(-5, 5, 50);
//Create z data (rows of z correspond to y values in Plotly surface traces)
const zValues = yRange.map(y =>
xRange.map(x => computeZ(x, y))
);
const data = [{
type: 'surface',
x: xRange,
y: yRange,
z: zValues,
colorscale: 'Viridis'
}];
const layout = {
title: 'z = sin(x) * cos(y)',
autosize: true,
scene: {
xaxis: { title: 'X axis' },
yaxis: { title: 'Y axis' },
zaxis: { title: 'Z axis' }
}
};
Plotly.newPlot('plot3d', data, layout);
</script>
Other surfaces to try:
- `z = Math.sin(x * y)` → ripples
- `z = x * x - y * y` → saddle surface
- `z = Math.exp(-(x * x + y * y))` → Gaussian peak

This example uses Plotly.js for interactive 3D visualization. You can freely change the body of the `computeZ` function to draw any three-dimensional surface.
3Dmol.js is an open source WebGL chemical molecule visualization library designed specifically for browsers, which can draw molecular structures directly on web pages.
<div id="viewer" style="width:400px;height:400px;"></div>
<script src="https://3dmol.org/build/3Dmol-min.js"></script>
<script>
const viewer = $3Dmol.createViewer("viewer", { backgroundColor: "white" });
viewer.addModel("C1=CC=CC=C1", "smi"); // SMILES structure of benzene
viewer.setStyle({}, {stick: {}, sphere: {scale: 0.3}});
viewer.zoomTo();
viewer.render();
</script>
ChemDoodle provides 2D and 3D structure drawings, supports a variety of chemical formats, and is suitable for teaching and web applications.
JSmol is a JavaScript version of Jmol, suitable for displaying large molecules such as proteins or crystal structures.
Mol* (MolStar) is a high-order structure visualization tool developed by RCSB PDB, specifically designed for biological macromolecules.
| Library | Main purpose | Open source? | Commercial license required? |
|---|---|---|---|
| 3Dmol.js | Universal 3D molecular visualization | ✅ | ❌ |
| ChemDoodle | 2D and 3D teaching and display | part | ✅ |
| JSmol | Academic research and teaching | ✅ | ❌ |
| Mol* | Protein and biomolecule visualization | ✅ | ❌ |
This example uses 3Dmol.js with the XYZ format to define the atomic coordinates of the benzene molecule, so the 3D structure displays correctly.
If the SMILES format ("smi") is used instead, some 3Dmol.js versions report the error `Unknown format: smi`, because they do not support that format.
Steps: save the code below as benzene.html, start a local HTTP server (for example with python -m http.server), then open http://localhost:8000 to view the result.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>3Dmol.js renders benzene molecules</title>
<script src="https://3dmol.org/build/3Dmol-min.js"></script>
<style>
#viewer {
width: 600px;
height: 600px;
position: relative;
border: 1px solid #aaa;
}
</style>
</head>
<body>
<h2>3Dmol.js benzene molecule display (XYZ format)</h2>
<div id="viewer"></div>
<script>
document.addEventListener("DOMContentLoaded", function () {
const viewer = $3Dmol.createViewer("viewer", { backgroundColor: "white" });
const xyzData = `
12
benzene
C 0.0000 1.3968 0.0000
H 0.0000 2.4903 0.0000
C -1.2096 0.6984 0.0000
H -2.1471 1.2451 0.0000
C -1.2096 -0.6984 0.0000
H -2.1471 -1.2451 0.0000
C 0.0000 -1.3968 0.0000
H 0.0000 -2.4903 0.0000
C 1.2096 -0.6984 0.0000
H 2.1471 -1.2451 0.0000
C 1.2096 0.6984 0.0000
H 2.1471 1.2451 0.0000
`;
viewer.addModel(xyzData, "xyz");
viewer.setStyle({}, {stick: {}, sphere: {scale: 0.3}});
viewer.zoomTo();
viewer.render();
});
</script>
</body>
</html>
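The XYZ format used above is simple enough to parse by hand: line 1 is the atom count, line 2 a comment, and each following line is an element symbol plus three coordinates. A minimal sketch of such a parser (the `parseXYZ` helper is illustrative, not part of 3Dmol.js):

```javascript
// Parse an XYZ-format string into { count, comment, atoms }, where each
// atom is { element, x, y, z }. Line 1: atom count; line 2: comment.
function parseXYZ(text) {
  const lines = text.trim().split("\n").map(l => l.trim());
  const count = parseInt(lines[0], 10);
  const atoms = lines.slice(2, 2 + count).map(line => {
    const [element, x, y, z] = line.split(/\s+/);
    return { element, x: parseFloat(x), y: parseFloat(y), z: parseFloat(z) };
  });
  return { count, comment: lines[1], atoms };
}

// Same benzene data as the 3Dmol.js example above.
const benzene = parseXYZ(`
12
benzene
C 0.0000 1.3968 0.0000
H 0.0000 2.4903 0.0000
C -1.2096 0.6984 0.0000
H -2.1471 1.2451 0.0000
C -1.2096 -0.6984 0.0000
H -2.1471 -1.2451 0.0000
C 0.0000 -1.3968 0.0000
H 0.0000 -2.4903 0.0000
C 1.2096 -0.6984 0.0000
H 2.1471 -1.2451 0.0000
C 1.2096 0.6984 0.0000
H 2.1471 1.2451 0.0000
`);
```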
Opening the HTML directly via file:// may fail; a local HTTP server is recommended.

The Google Maps JavaScript API allows developers to embed interactive maps into web pages and to dynamically add custom elements such as markers, layers, and text labels through JavaScript. The following example demonstrates how to display a map and add custom markers.
First, go to the Google Cloud Console, enable the Maps JavaScript API, and create an API key. After obtaining it, append ?key=YOUR_API_KEY when loading the script.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Google Map with Custom Tags</title>
<style>
#map {
width: 100%;
height: 500px;
}
</style>
</head>
<body>
<h3>My Map</h3>
<div id="map"></div>
<!-- Load Google Maps JS API -->
<script async
src="https://maps.googleapis.com/maps/api/js?key=YOUR_API_KEY&callback=initMap">
</script>
<script>
function initMap() {
//Initialize the map
const center = { lat: 25.033964, lng: 121.564468 }; // Taipei 101
const map = new google.maps.Map(document.getElementById("map"), {
zoom: 14,
center: center
});
//Create custom tag
const myTags = [
{ position: { lat: 25.034, lng: 121.565 }, title: "Mark A", content: "This is point A" },
{ position: { lat: 25.036, lng: 121.562 }, title: "Mark B", content: "This is point B" },
{ position: { lat: 25.032, lng: 121.568 }, title: "Mark C", content: "This is point C" }
];
//Create an information window (InfoWindow)
const infoWindow = new google.maps.InfoWindow();
//Add markers to the map
myTags.forEach(tag => {
const marker = new google.maps.Marker({
position: tag.position,
map: map,
title: tag.title,
icon: {
url: "https://maps.google.com/mapfiles/ms/icons/blue-dot.png"
}
});
// Click to display information
marker.addListener("click", () => {
infoWindow.setContent("<b>" + tag.title + "</b><br>" + tag.content);
infoWindow.open(map, marker);
});
});
}
</script>
</body>
</html>
Possible extensions:
- Supply your own `icon` images (PNG, SVG).
- Draw lines and areas with `google.maps.Polyline` or `Polygon`.
- Switch to satellite mode with `map.setMapTypeId('satellite')`.

| Property | Use |
|---|---|
| `center` | Sets the initial center coordinates of the map. |
| `zoom` | Map zoom level (1–20). |
| `mapTypeId` | Display style: `roadmap`, `satellite`, `hybrid`, or `terrain`. |
| `icon` | Custom marker icon. |
| `infoWindow` | Displays an information window when a marker is clicked. |
To play a specific MIDI sound (such as a guitar) in a browser, you can use the Web MIDI API or, more simply, the Web Audio API together with a SoundFont player such as the soundfont-player package.
<script src="https://unpkg.com/soundfont-player/dist/soundfont-player.js"></script>
<button onclick="playDoReMi()">Play Do Re Mi</button>
<script>
async function playDoReMi() {
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
const player = await Soundfont.instrument(audioCtx, 'acoustic_guitar_nylon');
const now = audioCtx.currentTime;
player.play('C4', now); // Do
player.play('D4', now + 0.5); // Re
player.play('E4', now + 1); // Mi
}
</script>
Notes on the example:
- `C4`, `D4`, `E4` represent Do, Re, Mi.
- `acoustic_guitar_nylon` is the SoundFont guitar tone (it can be changed to `electric_guitar_jazz`, etc.).

Available guitar tones include: `acoustic_guitar_nylon`, `acoustic_guitar_steel`, `electric_guitar_jazz`, `electric_guitar_clean`, `electric_guitar_muted`, `overdriven_guitar`, `distortion_guitar`, `guitar_harmonics`.

With soundfont-player and the Web Audio API, you can implement MIDI-level instrument playback without installing any plug-ins: just specify the timbre and pitch to play a scale melody such as Do Re Mi.
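Note names like C4 map to frequencies by the equal-tempered formula f = 440 × 2^((n − 69) / 12), where n is the MIDI note number (A4 = 69). A small sketch of the conversion (the `noteToFrequency` helper is illustrative; soundfont-player handles this internally):

```javascript
// Convert a note name like "C4" or "F#3" to its MIDI number and
// equal-tempered frequency (A4 = MIDI 69 = 440 Hz).
const SEMITONES = { C: 0, "C#": 1, D: 2, "D#": 3, E: 4, F: 5,
                    "F#": 6, G: 7, "G#": 8, A: 9, "A#": 10, B: 11 };

function noteToMidi(note) {
  const m = note.match(/^([A-G]#?)(-?\d+)$/);
  if (!m) throw new Error("bad note: " + note);
  const [, name, octave] = m;
  return SEMITONES[name] + (parseInt(octave, 10) + 1) * 12;
}

function noteToFrequency(note) {
  return 440 * Math.pow(2, (noteToMidi(note) - 69) / 12);
}

// C4 (middle C) is MIDI 60, about 261.63 Hz
const hz = noteToFrequency("C4");
```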
If no sound can be played through an external SoundFont, you can fall back on the Web Audio API's OscillatorNode to synthesize Do Re Mi directly, approximating a guitar style (short notes with a fast decay).
<button onclick="playDoReMi()">Play Do Re Mi</button>
<script>
function playTone(frequency, startTime, duration, context) {
const osc = context.createOscillator();
const gain = context.createGain();
osc.type = "triangle"; // Synthetic waveform close to guitar sound, can be changed to "square", "sawtooth"
osc.frequency.value = frequency;
gain.gain.setValueAtTime(0.2, startTime);
gain.gain.exponentialRampToValueAtTime(0.001, startTime + duration);
osc.connect(gain);
gain.connect(context.destination);
osc.start(startTime);
osc.stop(startTime + duration);
}
function playDoReMi() {
const context = new (window.AudioContext || window.webkitAudioContext)();
const now = context.currentTime;
// Frequency of Do Re Mi (C4, D4, E4)
playTone(261.63, now, 0.4, context); // C4
playTone(293.66, now + 0.5, 0.4, context); // D4
playTone(329.63, now + 1.0, 0.4, context); // E4
}
</script>
To get closer to a guitar tone:
- Adjust `osc.type` (start with `triangle` or `sawtooth`) or add a filter.
- The quick gain decay (`gain.exponentialRampToValueAtTime()`) simulates plucking.

Using the pure Web Audio API is the most stable and compatible approach. For advanced needs, you can add filters and echo, or integrate MIDI sound sources.
| osc.type | Name | Timbre characteristics | Commonly simulated instruments |
|---|---|---|---|
| `"sine"` | sine wave | The purest, no harmonics | Pure tones, tuning fork, flute, synthesized tones |
| `"square"` | square wave | Rich odd harmonics, sharp timbre | Synthesizers, 8-bit sound effects, electronic keyboards |
| `"sawtooth"` | sawtooth wave | Contains all harmonics; thick, bright tone | Strings, guitar, brass simulation |
| `"triangle"` | triangle wave | Only odd harmonics, softer sound | Woodwinds, soft electric-guitar tones |
| `"custom"` | custom waveform | Arbitrary user-defined waveforms | Special synthesized or sampled-like sounds |
const audioContext = new (window.AudioContext || window.webkitAudioContext)();
const osc = audioContext.createOscillator();
osc.type = "sawtooth"; // Can be changed to "sine", "square", "triangle"
osc.frequency.value = 440; // A4

// For a custom waveform, build it from Fourier coefficients.
// setPeriodicWave() switches osc.type to "custom" automatically;
// assigning osc.type = "custom" directly throws an error.
const real = new Float32Array([0, 1, 0.5, 0.25]);
const imag = new Float32Array(real.length);
const wave = audioContext.createPeriodicWave(real, imag);
osc.setPeriodicWave(wave);

osc.start();
Different `osc.type` values can simulate different styles of instrument sounds. To simulate a guitar, start with `sawtooth` or `triangle` and fine-tune the sound with envelopes, filters, and echo.
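The waveform that `createPeriodicWave(real, imag)` describes can be sketched in plain JavaScript: it is a Fourier series with `real` as cosine coefficients and `imag` as sine coefficients (index 0 is the DC offset and is ignored for audio). A minimal sampler (illustrative; note that Web Audio normalizes the result by default, which this sketch does not):

```javascript
// Evaluate the waveform described by createPeriodicWave(real, imag):
// a Fourier series with cosine coefficients `real` and sine
// coefficients `imag`. `phase` is in [0, 1) over one period.
function sampleWave(real, imag, phase) {
  let v = 0;
  for (let n = 1; n < real.length; n++) {
    v += real[n] * Math.cos(2 * Math.PI * n * phase)
       + imag[n] * Math.sin(2 * Math.PI * n * phase);
  }
  return v;
}

// The coefficients from the example above: a fundamental plus two
// weaker harmonics, all cosine-phased.
const real = [0, 1, 0.5, 0.25];
const imag = [0, 0, 0, 0];
const atStart = sampleWave(real, imag, 0); // 1 + 0.5 + 0.25 = 1.75
```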
WebAudioFont is recommended: this open-source JavaScript library supports thousands of MIDI sounds, including guitar and other instruments, with better sound quality and easy integration.
Example tone names include 'acoustic_guitar_steel', 'acoustic_guitar_nylon', 'electric_guitar_clean', and more. By combining WebAudioFont with the Web Audio API, you can play notes with real MIDI timbres (such as guitar), avoiding the thin sound of a single pure oscillator and the earlier silent-SoundFont problem.