We present an approach for automated skeleton rigging of 3D point cloud models of segmented characters. Unlike earlier systems that fit predefined skeleton templates or predict fixed sets of joints, our approach generates an animation skeleton tailored to the structure and geometry of the input 3D model. Our architecture is built on a stack of hourglass modules trained on a large dataset of rigged 3D characters mined from the web. It operates on a volumetric representation of the input 3D shape, augmented with geometric shape features that provide additional cues for joint and bone positions. The method also gives users straightforward control over the level of detail of the output skeleton. Our evaluation shows that, compared to several alternatives and baselines, our approach predicts animation skeletons that are significantly closer to those created by humans.
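To make the idea of a volumetric representation carrying extra geometric channels concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `voxelize_with_features`, the sparse dictionary layout, the two-channel format, and the use of a single scalar feature per point (standing in for some geometric descriptor) are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Voxel = Tuple[int, int, int]


def voxelize_with_features(
    points: Sequence[Tuple[float, float, float]],
    features: Sequence[float],
    resolution: int = 8,
) -> Dict[Voxel, List[float]]:
    """Map points into a sparse resolution^3 grid (hypothetical helper).

    Each occupied voxel stores two channels: [occupancy, mean feature].
    """
    if len(points) != len(features):
        raise ValueError("need one feature value per point")
    if not points:
        return {}

    # Axis-aligned bounding box; a single uniform scale keeps the aspect ratio.
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0

    sums: Dict[Voxel, float] = {}
    counts: Dict[Voxel, int] = {}
    for p, f in zip(points, features):
        # Clamp so points on the upper boundary fall into the last voxel.
        idx = tuple(
            min(resolution - 1, int((p[a] - mins[a]) / extent * resolution))
            for a in range(3)
        )
        sums[idx] = sums.get(idx, 0.0) + f
        counts[idx] = counts.get(idx, 0) + 1

    # Channel 0: occupancy flag. Channel 1: average feature of points in the voxel.
    return {v: [1.0, sums[v] / counts[v]] for v in sums}
```

A network that consumes such a grid would typically use a dense tensor with one channel per feature. The sparse dictionary here only keeps the sketch short and dependency-free.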
