The Complete Process of PyTorch Installation and Environment Setup

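A note up front: AI's command of language keeps improving, so the author wants to state clearly that every article here is typed by hand, with no AI writing assistance. If you find the article useful, please follow.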

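Contents

1. Overview
2. Environment Setup

Notes on GPU Rendering Acceleration
requirements.txt Dependencies

3. 3D Rendering Basics
4. Mesh Rendering (with Color Gradient)
5. Point Cloud Rendering
6. Generating 360° Rotating GIFs

6.1 From Mesh to GIF
6.2 From PointCloud to GIF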

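1. Overview

The experiments in this post are based on the CMU course 16-825, Assignment 1: Rendering Basics with PyTorch3D (Total: 100 Points + 10 Bonus). The materials are the same, but the author explains them in detail from personal experience and adds some modifications and new ideas.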

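2. Environment Setup

The README.md of the original assignment targets Linux, but Windows (10/11) works as well: remove the MAX_JOBS=8 parameter to adapt the commands. Anaconda must be installed first.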

# GPU installation on a CUDA 11.6 machine
conda create -n learning3d python=3.10
conda activate learning3d
# Change cu116 to match your CUDA version.
pip install torch --index-url https://download.pytorch.org/whl/cu116
pip install fvcore iopath
# PyTorch3D is built from source, so this step takes some time to compile.
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install -r requirements.txt

# CPU installation
conda create -n learning3d python=3.10
conda activate learning3d
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install fvcore iopath
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install -r requirements.txt

If the command pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable" fails because of network problems, you can replace it with pip install "git+ssh://git@github.com/facebookresearch/pytorch3d.git@stable".
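The install commands above differ only in the suffix of the wheel index URL (cu116 for CUDA 11.6, cpu for the CPU build). A minimal sketch of that naming pattern follows; the helper names wheel_index_url and pip_command are hypothetical, and whether an index actually exists for a given CUDA version still has to be checked on the PyTorch site.

```python
from typing import Optional

# Base URL taken from the install commands in this post.
PYTORCH_WHEEL_BASE = "https://download.pytorch.org/whl"


def wheel_index_url(cuda_version: Optional[str]) -> str:
    """Build the --index-url for a CUDA version written as "major.minor".

    Passing None selects the CPU-only index. This only reproduces the
    naming pattern; it does not check that the index exists.
    """
    if cuda_version is None:
        return f"{PYTORCH_WHEEL_BASE}/cpu"
    major, minor = cuda_version.split(".")[:2]
    return f"{PYTORCH_WHEEL_BASE}/cu{int(major)}{int(minor)}"


def pip_command(cuda_version: Optional[str]) -> str:
    """Return the pip command string for the chosen build (hypothetical helper)."""
    return f"pip install torch --index-url {wheel_index_url(cuda_version)}"


print(pip_command("11.6"))
print(pip_command(None))
```

The output strings match the two pip install torch lines used in the GPU and CPU recipes.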

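Notes on GPU Rendering Acceleration

On CUDA versions: GeForce RTX 50-series cards (Blackwell architecture) generally require CUDA 12.8 or later, and the installed PyTorch build has to match the CUDA version. PyTorch3D, however, is installed by downloading the source and compiling it locally. The CUDA versions the author has tested successfully so far are 12.1 and 12.4; 11.6 and 11.8 should work too. In the author's experience, this leaves RTX 50-series users with CPU rendering only.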

requirements.txt Dependencies

imageio
matplotlib
numpy
PyMCubes
tqdm
scipy
plotly

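3. 3D Rendering Basics

Next come some basics of 3D computer vision. Asking an AI assistant can deepen your understanding, and discussion in the comments is welcome.

(1) 3D data can be stored as point clouds, parametric surfaces, meshes, implicit surfaces, voxels, and so on. An RGB-D image is not 3D data in the strict sense; it is closer to 2.5D data and has to be processed into another 3D representation (usually a point cloud). In autonomous driving, the vehicle's LiDAR captures point clouds, while depth cameras provide RGB-D images;

(2) To visualize 3D data, it must be rendered into an RGB image. Rendering needs the object's 3D data, the camera position, and the light position. The renderer returns data of shape BxHxWx4, which is then sliced down to an HxWx3 RGB image;

(3) The adjustable camera parameters are the translation relative to the world origin (a 3-vector), the rotation in space (each camera is described by a 3x3 rotation matrix), and the field of view (FOV). The adjustable light parameter is the light position (a 3-vector);

(4) This experiment trims the original course content and focuses on rendering Mesh and PointCloud data in the pytorch3d framework. A Mesh consists of N_v 3D vertices, N_f triangular faces (each face's orientation is determined by the order of its three vertex indices), and N_v three-element textures (one per vertex). A PointCloud consists of N 3D points plus N RGB color vectors, and can be obtained directly or computed from an RGB-D image.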
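To make item (4) concrete, here is a small pure-Python sketch (no pytorch3d) of the mesh layout: vertices, faces that index into them, and one color per vertex. It also shows how the order of a face's vertex indices flips its normal. The helper names are made up for illustration.

```python
def subtract(a, b):
    """Component-wise a - b for 3D tuples."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def face_normal(vertices, face):
    """Unnormalized normal of a triangle; its sign depends on the index order."""
    i, j, k = face
    return cross(subtract(vertices[j], vertices[i]), subtract(vertices[k], vertices[i]))


# A one-triangle mesh: N_v = 3 vertices, N_f = 1 face, N_v per-vertex colors.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2)]
textures = [(1.0, 0.3, 0.3)] * len(vertices)

assert len(textures) == len(vertices)                            # one color per vertex
assert all(0 <= idx < len(vertices) for f in faces for idx in f)  # faces index into vertices

normal_forward = face_normal(vertices, (0, 1, 2))
normal_reversed = face_normal(vertices, (0, 2, 1))
print(normal_forward, normal_reversed)
```

Swapping two indices of the face reverses the normal, which is why the index order is said to determine the face's orientation.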

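4. Mesh Rendering (with Color Gradient)

With the basics above, getting comfortable with the 3D visualization workflow is mostly a matter of learning the pytorch3d API. Read the cow's vertices and triangular faces from the obj file, build a mesh object from the vertices, faces, and texture data, then specify the camera and light positions to render.

Note the unsqueeze that adds the batch dimension B, and the final slice down to HxWx3.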

import os

import matplotlib.pyplot as plt
import numpy
import pytorch3d.renderer
import pytorch3d.structures
import torch

# Helpers from the course starter code (import path assumed from the assignment layout).
from starter.utils import get_device, get_mesh_renderer, load_cow_mesh


def render_setup(filepath="data/cow.obj", image_size=256, color1=None, color2=None,
                 Camera_R=None, Camera_T=None, device=None, savepath='01setup.jpg', record=True):
    # Rendering is much faster on a CUDA-enabled NVIDIA GPU, but the code also runs on a CPU.
    if device is None:
        device = get_device()
    # Get the renderer.
    renderer = get_mesh_renderer(image_size=image_size)
    # Get the vertices and faces, then add the batch dimension.
    vertices, faces = load_cow_mesh(filepath)
    vertices = vertices.unsqueeze(0)  # (N_v, 3) -> (1, N_v, 3)
    faces = faces.unsqueeze(0)  # (N_f, 3) -> (1, N_f, 3)
    assert color1 is not None
    if color2 is not None:
        # Per-vertex gradient between color1 and color2, shape (1, N_v, 3).
        color1 = varying_color(vertices, color1, color2)
    textures = torch.ones_like(vertices)  # (1, N_v, 3)
    textures = textures * torch.as_tensor(color1, dtype=vertices.dtype)  # (1, N_v, 3)
    mesh = pytorch3d.structures.Meshes(
        verts=vertices,
        faces=faces,
        textures=pytorch3d.renderer.TexturesVertex(textures),
    )
    mesh = mesh.to(device)
    # Debug output: the camera pose passed in by the caller.
    print(Camera_R)
    print(Camera_T)
    # Prepare the camera (identity rotation and T = [0, 0, 3] by default).
    R = torch.eye(3).unsqueeze(0) if Camera_R is None else Camera_R
    T = torch.tensor([[0.0, 0.0, 3.0]]) if Camera_T is None else Camera_T
    cameras = pytorch3d.renderer.FoVPerspectiveCameras(R=R, T=T, fov=60, device=device)
    # Place a point light in front of the cow.
    lights = pytorch3d.renderer.PointLights(location=[[0, 0, -3]], device=device)
    rend = renderer(mesh, cameras=cameras, lights=lights)
    # .cpu() moves the tensor back to the CPU (if needed) before converting to numpy.
    rend = rend.cpu().numpy()[0, ..., :3]  # (B, H, W, 4) -> (H, W, 3)
    if record:
        # os.path.split keeps the directory separators intact; no chdir is needed.
        out_dir, filename = os.path.split(savepath)
        if out_dir:
            os.makedirs(out_dir, exist_ok=True)
        if Camera_R is not None and Camera_T is not None:
            # Prefix the name with a short pose summary so different views get different files.
            filename = f'{float(Camera_R[0].numpy().sum()), float(Camera_T[0].numpy().sum())}' + filename
        plt.imsave(os.path.join(out_dir, filename), numpy.uint8(rend * 255))
    return rend
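One detail of the save step deserves a closer look: joining the split path components with an empty string drops the separators, so "out/mesh/01setup.jpg" yields the directory name "outmesh". A minimal sketch of a safer helper follows; prepare_save_path is a hypothetical name, not part of the starter code.

```python
import os
import tempfile

# Joining the components with '' loses the '/' separators.
broken_directory = ''.join("out/mesh/01setup.jpg".split('/')[:-1])
print(broken_directory)


def prepare_save_path(savepath: str, prefix: str = "") -> str:
    """Create the parent directory if there is one and return the final file path.

    The optional prefix is prepended to the file name only, not to the directory.
    """
    directory, filename = os.path.split(savepath)
    if directory:
        os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, prefix + filename)


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as root:
    target = prepare_save_path(os.path.join(root, "out", "mesh", "01setup.jpg"), prefix="view0_")
    directory_created = os.path.isdir(os.path.dirname(target))
    print(os.path.basename(target), directory_created)
```

Because the helper returns a full path, the caller never has to change the working directory, and relative inputs such as data/cow.obj keep resolving on later calls.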

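If you want a more colorful cow, you can render a linear color gradient from head to body to tail (that is, along the z axis, by distance from the default camera). The result looks quite good, and artists can pick color1 for the front and color2 for the back to render a cow in whatever colors they like: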

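def varying_color(vertices, color1, color2):
    # vertices: (1, N_v, 3). Returns per-vertex colors of the same shape.
    assert vertices.shape[0] == 1
    z = vertices[0, :, 2]
    z_min = torch.min(z)
    z_max = torch.max(z)
    # Float tensors so that integer color lists such as [1, 0, 0] also work.
    color1 = torch.tensor(color1, dtype=vertices.dtype).view(1, 3)
    color2 = torch.tensor(color2, dtype=vertices.dtype).view(1, 3)
    # alpha is 0 at z_min (color1) and 1 at z_max (color2); assumes z_max > z_min.
    alpha = ((z - z_min) / (z_max - z_min)).reshape(-1, 1)  # (N_v, 1)
    var_color = torch.matmul(alpha, color2) + torch.matmul(1 - alpha, color1)  # (N_v, 3)
    var_color = var_color.unsqueeze(0)  # (1, N_v, 3)
    assert var_color.shape == vertices.shape
    return var_color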
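The interpolation in varying_color can be checked by hand on a few vertices. The sketch below repeats the same formula in plain Python, color = (1 - alpha) * color1 + alpha * color2 with alpha = (z - z_min) / (z_max - z_min). The function name lerp_colors and the guard for a flat mesh (z_max equal to z_min) are additions for illustration.

```python
def lerp_colors(z_values, color1, color2):
    """Linearly blend color1 (at the smallest z) into color2 (at the largest z)."""
    z_min, z_max = min(z_values), max(z_values)
    span = z_max - z_min
    colors = []
    for z in z_values:
        # Guard against division by zero when all vertices share the same z.
        alpha = (z - z_min) / span if span > 0 else 0.0
        colors.append(tuple((1 - alpha) * c1 + alpha * c2
                            for c1, c2 in zip(color1, color2)))
    return colors


# Three vertices at the front, middle, and back of the object.
colors = lerp_colors([-1.0, 0.0, 1.0], (1.0, 0.3, 0.3), (0.3, 0.3, 1.0))
for color in colors:
    print(tuple(round(channel, 3) for channel in color))
```

The first and last vertices receive color1 and color2 unchanged, and the middle vertex receives their average.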

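5. Point Cloud Rendering

This corresponds to task 5.1 of Assignment 1. The RGB-D image data is first converted to point clouds with unproject_depth_image. Three point clouds are used in total: pcloud, pcloud2, and pcloud_cb, corresponding to the first plant, the second plant, and both plants combined.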

# Requires os, numpy, torch, pytorch3d.renderer, pytorch3d.structures and matplotlib.pyplot as plt.
# get_device, get_points_renderer, unproject_depth_image and load_rgbd_data are helpers
# from the course starter code.
def render_setup_from_pointcloud(filepath, image_size=256,
                                 Camera_R=None, Camera_T=None, device=None, savepath='01setup.jpg', record=True):
    if device is None:
        device = get_device()
    # Get the renderer.
    renderer = get_points_renderer(image_size=image_size, radius=0.01)
    # Load the RGB-D data (rgb, mask, and depth arrays for both views).
    data = load_rgbd_data(filepath)
    print(data.keys())
    # Fixed camera, used only to unproject the depth images into 3D points.
    camera_fixed = pytorch3d.renderer.FoVPerspectiveCameras(
        R=torch.eye(3).unsqueeze(0), T=torch.tensor([[0.0, 0.0, 3.0]]),
        fov=60, device=device)
    points, rgbs = unproject_depth_image(torch.tensor(data["rgb1"]), torch.tensor(data["mask1"]),
                                         torch.tensor(data["depth1"]), camera_fixed)
    points2, rgbs2 = unproject_depth_image(torch.tensor(data["rgb2"]), torch.tensor(data["mask2"]),
                                           torch.tensor(data["depth2"]), camera_fixed)
    # First plant.
    pcloud = pytorch3d.structures.Pointclouds(
        points=points.unsqueeze(0),
        features=rgbs.unsqueeze(0)
    ).to(device)
    # Second plant.
    pcloud2 = pytorch3d.structures.Pointclouds(
        points=points2.unsqueeze(0),
        features=rgbs2.unsqueeze(0)
    ).to(device)
    # Both plants combined into one cloud.
    pcloud_cb = pytorch3d.structures.Pointclouds(
        points=torch.cat([points, points2], dim=0).unsqueeze(0),
        features=torch.cat([rgbs, rgbs2], dim=0).unsqueeze(0)
    ).to(device)
    pclouds = [pcloud, pcloud2, pcloud_cb]
    rends = []
    # Place a point light in front of the plants.
    lights = pytorch3d.renderer.PointLights(location=[[0, 0, -3]], device=device)
    # Dynamic camera used for rendering; it may differ from the fixed unprojection camera.
    camera_dynamic = pytorch3d.renderer.FoVPerspectiveCameras(
        R=torch.eye(3).unsqueeze(0) if Camera_R is None else Camera_R,
        T=torch.tensor([[0.0, 0.0, 3.0]]) if Camera_T is None else Camera_T,
        fov=60, device=device)
    # os.path.split keeps the directory separators intact; no chdir is needed.
    out_dir, filename = os.path.split(savepath)
    if record and out_dir:
        os.makedirs(out_dir, exist_ok=True)
    for i, cloud in enumerate(pclouds):
        rend = renderer(cloud, cameras=camera_dynamic, lights=lights)
        # .cpu() moves the tensor back to the CPU (if needed) before converting to numpy.
        rend = rend.cpu().numpy()[0, ..., :3]  # (B, H, W, 4) -> (H, W, 3)
        if record:
            prefix = f'{i + 1}'
            if Camera_R is not None and Camera_T is not None:
                # Short pose summary so different views get different file names.
                prefix += f'{float(Camera_R[0].numpy().sum()), float(Camera_T[0].numpy().sum())}'
            plt.imsave(os.path.join(out_dir, prefix + filename), numpy.uint8(rend * 255))
        rends.append(rend)
    return rends

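Point cloud rendering is broadly similar to mesh rendering. The extra point to note is that converting an RGB-D image to a point cloud requires the camera's position, orientation, and field of view. Building the point cloud data requires both the points and their colors.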
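Why the camera parameters matter can be seen from a simplified pinhole model. The sketch below is not the starter code's unproject_depth_image. It assumes a square image, aspect ratio 1, depth measured along the camera's z axis, identity rotation, and the row-vector convention X_view = X_world + T that PyTorch3D uses when R is the identity. The function name unproject_pixel is hypothetical.

```python
import math


def unproject_pixel(x_ndc, y_ndc, depth, fov_degrees=60.0, translation=(0.0, 0.0, 3.0)):
    """Map an NDC pixel coordinate in [-1, 1] plus a depth to a world-space point.

    Assumes identity rotation, so the world point is the view-space point minus T.
    """
    half_extent = math.tan(math.radians(fov_degrees) / 2.0)
    # At a given depth, the visible half-width of the frustum is depth * tan(fov / 2).
    view_point = (x_ndc * depth * half_extent, y_ndc * depth * half_extent, depth)
    return tuple(v - t for v, t in zip(view_point, translation))


center = unproject_pixel(0.0, 0.0, depth=3.0)
edge = unproject_pixel(1.0, 0.0, depth=3.0)
print(center)
print(tuple(round(c, 4) for c in edge))
```

With T = (0, 0, 3), the image center at depth 3 lands on the world origin, and a pixel on the image border lands 3 * tan(30°) ≈ 1.732 units to the side. Changing the field of view or the translation moves every unprojected point, which is why they have to be specified.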

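6. Generating 360° Rotating GIFs

6.1 From Mesh to GIF

What changes here is the camera passed to the renderer's __call__. The look_at_view_transform function in the cameras module of pytorch3d.renderer computes the current camera's R matrix (3x3 rotation) and T vector (3-element translation). It needs dist, the distance from the world origin, and azim, the azimuth angle, which the author takes to be the angle from the z axis (this matches the PyTorch3D documentation, where azim is measured in the horizontal plane from the +z direction). A dynamic camera is then created from R and T, while the light position stays at [[0,0,-3]].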
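The orbit itself is simple geometry. The sketch below lists the azimuth angles used for the frames and the corresponding camera positions, assuming the spherical convention documented for look_at_view_transform (y axis up, azim measured from +z in the y = 0 plane, elev = 0). It does not call pytorch3d, and the function names are hypothetical.

```python
import math


def orbit_azimuths(n_render, start=180.0):
    """Azimuth in degrees for each frame: one full turn split into n_render steps."""
    return [start + 360.0 * i / n_render for i in range(n_render)]


def camera_position(dist, azim_degrees, elev_degrees=0.0):
    """Camera center on a sphere of radius dist around the world origin."""
    azim = math.radians(azim_degrees)
    elev = math.radians(elev_degrees)
    x = dist * math.cos(elev) * math.sin(azim)
    y = dist * math.sin(elev)
    z = dist * math.cos(elev) * math.cos(azim)
    return (x, y, z)


azimuths = orbit_azimuths(4)
positions = [camera_position(3.0, azim) for azim in azimuths]
for azim, position in zip(azimuths, positions):
    print(azim, tuple(round(c, 3) for c in position))
```

Under this convention the first frame (azim = 180) places the camera at roughly (0, 0, -3), the same spot as the point light, so the orbit starts with the object lit from the camera's side.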

GIF 1: dist=3, color1=[1,0.3,0.3], color2=None, 15 fps, 4 seconds long

GIF 2: dist=4, color1=[1,0.3,0.3], color2=[0.3,0.3,1], 15 fps, 4 seconds long

import imageio


def gif_360(n_render, color1, color2, savepath):
    # `args` is the script-level argparse namespace providing cow_path and image_size.
    my_images = []
    for i in range(n_render):
        # Orbit the camera: start at azim=180 and advance 360/n_render degrees per frame.
        R, T = pytorch3d.renderer.cameras.look_at_view_transform(dist=3, azim=180 + 360 * i / n_render)
        image = render_setup(filepath=args.cow_path, image_size=args.image_size, color1=color1, color2=color2,
                             Camera_R=R, Camera_T=T, record=False)
        my_images.append(numpy.uint8(image * 255))
        print(i, "/", n_render)
    # os.path.dirname keeps the directory separators intact.
    out_dir = os.path.dirname(savepath)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    imageio.mimwrite(savepath, my_images, fps=24)

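6.2 From PointCloud to GIF

Note that the camera used to convert RGB-D to PointCloud must stay fixed. If both the unprojection and the rendering use the dynamic R/T camera from look_at_view_transform, every point is projected back onto the pixel it came from, and the result is a completely static picture that is a "GIF" in name only.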

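The resulting GIF plays at 24 fps and lasts 9 seconds, but it actually interleaves the renders of the three point clouds from Section 5, so the plants flicker between the three versions and each cloud effectively gets only 8 frames per second.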

def gif_360_pcloud(n_render, savepath):
    # `args` is the script-level argparse namespace providing bridge_path and image_size.
    if savepath.split('.')[-1] != 'gif':
        raise ValueError("savepath must end in .gif")
    my_images = []
    for i in range(n_render):
        # Only the rendering camera orbits; the unprojection camera inside
        # render_setup_from_pointcloud stays fixed.
        R, T = pytorch3d.renderer.cameras.look_at_view_transform(dist=4, azim=180 + 360 * i / n_render)
        images = render_setup_from_pointcloud(filepath=args.bridge_path, image_size=args.image_size,
                                              Camera_R=R, Camera_T=T,
                                              savepath=savepath.replace("gif", "jpg"), record=False)
        # Each view yields three images (plant 1, plant 2, both), appended back to back.
        for img in images:
            my_images.append(numpy.uint8(img * 255))
        print(i, "/", n_render)
    imageio.mimwrite(savepath, my_images, fps=24)
    return my_images
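The frame-rate arithmetic for these GIFs is easy to verify. In the sketch below, gif_stats is a hypothetical helper, and n_render = 72 is an assumed value chosen because 72 views times 3 clouds at 24 fps gives the 9 seconds quoted above; the post does not state the actual value.

```python
def gif_stats(n_render, fps, renders_per_view=1):
    """Frame count, duration, and per-subject frame rate of an interleaved GIF."""
    frames = n_render * renders_per_view
    return {
        "frames": frames,
        "duration_s": frames / fps,
        # When several subjects alternate, each one is shown only every k-th frame.
        "fps_per_subject": fps / renders_per_view,
    }


# Point cloud GIF: three clouds per camera position (n_render=72 is an assumption).
pcloud_stats = gif_stats(n_render=72, fps=24, renders_per_view=3)
# Mesh GIF: one render per camera position at 15 fps for 4 seconds needs 60 views.
mesh_stats = gif_stats(n_render=60, fps=15)
print(pcloud_stats)
print(mesh_stats)
```

To get a smooth GIF of a single cloud, one option is to append only one of the three returned images per view, which restores the full 24 frames per second for that cloud.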

A closing note: this experiment mainly uses the structures and renderer modules of the pytorch3d library.

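This article is a repost; we respect the original author's copyright. If there are content errors or infringement concerns, the original author is welcome to contact us to correct or remove the article.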
