{"id":2962,"date":"2018-09-17T17:48:47","date_gmt":"2018-09-17T21:48:47","guid":{"rendered":"https:\/\/www.med.unc.edu\/bigs2\/?page_id=2962"},"modified":"2019-10-08T17:47:35","modified_gmt":"2019-10-08T21:47:35","slug":"clustering-shape-data","status":"publish","type":"page","link":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/","title":{"rendered":"Clustering Shape Data"},"content":{"rendered":"<div id=\"content1\">\n<div id=\"main\">\n<div id=\"right\">\n<div id=\"right_text\">\n<div class=\"box\">\n<h4><strong>Title:\u00a0<\/strong><b>Clustering High-Dimensional Landmark-Based Two-Dimensional Shape Data<\/b><\/h4>\n<p>&nbsp;<\/p>\n<h4>Introduction<\/h4>\n<p>Shape analysis has been an important research topic with various applications in computer vision, object recognition,\u00a0and medical imaging for the last several decades. An important goal in shape analysis is to classify and recognize objects of\u00a0interest according to the shapes of their boundaries. The majority of earlier work on shape analysis has focused on\u00a0landmark-based analysis, where shapes are represented by a coarse, discrete sampling of the object contours.<\/p>\n<p>Clustering landmark-based planar shape data raises four major challenges. First, planar shape data reside in a curved shape space,\u00a0which is invariant under similarity transformations, including rigid rotation and translation and nonrigid uniform scaling.\u00a0Therefore, most clustering methods (e.g., K-means) proposed for Euclidean data cannot be used to cluster data in the curved\u00a0shape space. Second, it is a standard high-dimensional-low-sample-size problem, since the shape dimension,\u00a0which is proportional to the number of landmark points, can be much larger than the sample size. 
Moreover, there may be\u00a0significant amounts of noise in many of the landmark points, which is either associated with the complexity of the studied\u00a0shapes or caused by certain preprocessing steps such as image filtering and edge detection. Third, the landmark points\u00a0along the boundaries of objects are inherently and spatially correlated with each other. Fourth, shape variation is\u00a0commonly associated with some explanatory attributes (e.g., age, gender, or disease status).\u00a0Ignoring such complex spatial correlation and explanatory attributes can introduce substantial errors\u00a0in both clustering and classification results.<\/p>\n<h4>Questions<\/h4>\n<p>The aim of this project is to propose a mixture of offset-normal shape factor analyzer (MOSFA) model to address the four\u00a0challenges.<\/p>\n<h4>Methods<\/h4>\n<p>We used the offset-normal shape distribution to characterize the variability of shape data in the curved shape space.\u00a0To handle high dimensionality, we used a penalized clustering framework as an effective and powerful method to perform\u00a0both variable selection and\u00a0clustering. We integrated a latent factor analysis model to approximate the complex spatial correlation of shape data.\u00a0We used a logistic regression model to build an association between mixing proportions and covariates of interest.\u00a0We proposed an expectation-maximization (EM) algorithm and established its convergence property.\u00a0We established the asymptotic properties of the penalized estimator obtained from the EM algorithm.<\/p>\n<h4>Findings<\/h4>\n<p>We applied the MOSFA model to the ADHD-200 CC shape dataset. Our clustering results reveal an intrinsic subpopulation structure in the mixed population of controls and subjects with ADHD.\u00a0We calculated the MPLE using the EM algorithm, and the final MOSFA model detected four clusters with 239, 98, 64,\u00a0and 246 subjects, respectively. 
The first three clusters contain 391 normal controls and 10 ADHD patients,\u00a0whereas the fourth cluster includes 13 normal controls and 233 ADHD patients. Thus, the first three clusters contain almost all\u00a0the normal controls, whereas most diseased subjects fall into the last cluster.\u00a0The mean shapes of the first three clusters are similar to each other but differ from the mean shape of the fourth cluster.<\/p>\n<p>We are also interested in the estimated loading matrices for all four clusters. The estimated number of factors in\u00a0each cluster is 2. To extract the shape features of each cluster, we plotted each column of the loading matrices for all\u00a0the clusters. The columns of the loading matrices from the first three clusters show similar tendencies,\u00a0whereas they differ from those of the last cluster. This is consistent with the diagnosis information:\u00a0most normal controls are in the first three clusters, whereas most ADHD patients are in the last cluster.<\/p>\n<p>Then, we randomly chose subjects from each cluster and applied the ClosedCurves2D3D software (<a href=\"http:\/\/ssamg.stat.fsu.edu\/software\" target=\"new\" rel=\"noopener noreferrer\">ClosedCurves2D3D<\/a>)\u00a0to compute pairwise geodesic paths among the four clusters under the elastic Riemannian metric.\u00a0The results show that the geodesic distance between subjects in the same cluster is smaller than that between subjects\u00a0in different clusters. 
Furthermore, the geodesic distance between subjects in the first three clusters is\u00a0much smaller than the geodesic distance between subjects in the first three clusters and those in the fourth cluster.<\/p>\n<p><img decoding=\"async\" class=\"center\" style=\"width: 804px;height: 428px\" src=\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg\" \/><\/p>\n<p><img decoding=\"async\" class=\"center\" style=\"width: 804px;height: 428px\" src=\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig2.jpeg\" \/><\/p>\n<p><img decoding=\"async\" class=\"center\" style=\"width: 804px;height: 428px\" src=\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig3.jpeg\" \/><\/p>\n<p>&nbsp;<\/p>\n<h4>References<\/h4>\n<p>Huang, C., Styner, M., and Zhu, H.T. &#8220;Clustering High-Dimensional Landmark-based Two-dimensional Shape Data.&#8221;<br \/>\nJournal of the American Statistical Association, 110, 946-961, 2015.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"left\">\n<p><!--- need updated part --><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<div id=\"footer\">\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<p><!-- footer ends--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Title:\u00a0Clustering High-Dimensional Landmark-Based Two-Dimensional Shape Data &nbsp; Introduction Shape analysis has been an important research topic with various applications in computer vision, object recognition,\u00a0and medical imaging for last several decades. An important goal in shape analysis is to classify and recognize objects of\u00a0interest according to the shapes of their boundaries. 
The majority of earlier work &hellip; <a href=\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/\" aria-label=\"Read more about Clustering Shape Data\">Read more<\/a><\/p>\n","protected":false},"author":1503,"featured_media":0,"parent":2891,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-2962","page","type-page","status-publish","hentry","odd"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Clustering Shape Data - BIG-S2<\/title>\n<meta name=\"robots\" content=\"noindex, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Clustering Shape Data - BIG-S2\" \/>\n<meta property=\"og:description\" content=\"Title:\u00a0Clustering High-Dimensional Landmark-Based Two-Dimensional Shape Data &nbsp; Introduction Shape analysis has been an important research topic with various applications in computer vision, object recognition,\u00a0and medical imaging for last several decades. An important goal in shape analysis is to classify and recognize objects of\u00a0interest according to the shapes of their boundaries. 
The majority of earlier work &hellip; Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/\" \/>\n<meta property=\"og:site_name\" content=\"BIG-S2\" \/>\n<meta property=\"article:modified_time\" content=\"2019-10-08T21:47:35+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/\",\"url\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/\",\"name\":\"Clustering Shape Data - 
BIG-S2\",\"isPartOf\":{\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg\",\"datePublished\":\"2018-09-17T21:48:47+00:00\",\"dateModified\":\"2019-10-08T21:47:35+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#primaryimage\",\"url\":\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg\",\"contentUrl\":\"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg\",\"width\":934,\"height\":544},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.med.unc.edu\/bigs2\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Projects\",\"item\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Shape Analysis\",\"item\":\"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Clustering Shape 
Data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.med.unc.edu\/bigs2\/#website\",\"url\":\"https:\/\/www.med.unc.edu\/bigs2\/\",\"name\":\"BIG-S2\",\"description\":\"Biostatistics and Imaging Genomics analysis lab - Statistics &amp; Signal\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.med.unc.edu\/bigs2\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Clustering Shape Data - BIG-S2","robots":{"index":"noindex","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"og_locale":"en_US","og_type":"article","og_title":"Clustering Shape Data - BIG-S2","og_description":"Title:\u00a0Clustering High-Dimensional Landmark-Based Two-Dimensional Shape Data &nbsp; Introduction Shape analysis has been an important research topic with various applications in computer vision, object recognition,\u00a0and medical imaging for last several decades. An important goal in shape analysis is to classify and recognize objects of\u00a0interest according to the shapes of their boundaries. The majority of earlier work &hellip; Read more","og_url":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/","og_site_name":"BIG-S2","article_modified_time":"2019-10-08T21:47:35+00:00","og_image":[{"url":"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/","url":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/","name":"Clustering Shape Data - BIG-S2","isPartOf":{"@id":"https:\/\/www.med.unc.edu\/bigs2\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#primaryimage"},"image":{"@id":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#primaryimage"},"thumbnailUrl":"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg","datePublished":"2018-09-17T21:48:47+00:00","dateModified":"2019-10-08T21:47:35+00:00","breadcrumb":{"@id":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#primaryimage","url":"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg","contentUrl":"https:\/\/www.med.unc.edu\/bigs2\/wp-content\/uploads\/sites\/822\/2018\/09\/project_shape_p1-fig1.jpeg","width":934,"height":544},{"@type":"BreadcrumbList","@id":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/clustering-shape-data\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.med.unc.edu\/bigs2\/"},{"@type":"ListItem","position":2,"name":"Projects","item":"https:\/\/www.med.unc.edu\/bigs2\/projects\/"},{"@type":"ListItem","position":3,"name":"Shape 
Analysis","item":"https:\/\/www.med.unc.edu\/bigs2\/projects\/shape-analysis\/"},{"@type":"ListItem","position":4,"name":"Clustering Shape Data"}]},{"@type":"WebSite","@id":"https:\/\/www.med.unc.edu\/bigs2\/#website","url":"https:\/\/www.med.unc.edu\/bigs2\/","name":"BIG-S2","description":"Biostatistics and Imaging Genomics analysis lab - Statistics &amp; Signal","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.med.unc.edu\/bigs2\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links_to":[],"_links_to_target":[],"_links":{"self":[{"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/pages\/2962","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/users\/1503"}],"replies":[{"embeddable":true,"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/comments?post=2962"}],"version-history":[{"count":0,"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/pages\/2962\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/pages\/2891"}],"wp:attachment":[{"href":"https:\/\/www.med.unc.edu\/bigs2\/wp-json\/wp\/v2\/media?parent=2962"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}