The following must be cited when using this dataset: "Data collection and sharing was supported by the National Cancer Institute-funded Breast Cancer Surveillance Consortium (HHSN261201100031C)."

To obtain the actual data in SAS or CSV format, you must begin a data-only request. Data will be delivered once the project is approved and data transfer agreements are completed. Users are advised to read the Data Quality Statement for the 2010 version of the ACD.

The breast cancer dataset is a classic and very easy binary classification dataset. 1 means the cancer is malignant and 0 means benign. The columns of the dataset are described in the NAMES file. Contribute to datasets/breast-cancer development by creating … Instances: 569, Attributes: 10, Tasks: Classification.

The Lung Cancer dataset (~2,100, one record per lung cancer) contains information about each lung cancer diagnosed during the trial, including multiple primary tumors in the same individual.

…2% of new cancer diagnoses in England were made at an early stage (stage 1 or 2), down from 52…

A reader asks: "Just want to know if there are any other datasets including this disease."

Other tasks listed in the catalog: predict home team outcome in all international soccer (football) matches (classification); wart treatment results of 90 patients using cryotherapy; predict engine miles per gallon of cars from the 1970s and 1980s.
Question: pancreatic cancer datasets. Of course, TCGA is already done.

This data set describes over 2000 U.S. electric utilities.

Alignment positions of sequence reads (hg18): arachne_qltout_marks.tar.gz. Matlab files with alignable coordinates: hg18_alignable_N36_D2.tar.gz. Matlab source code: SegSeq version 1.0.1.

This is a dataset about breast cancer occurrences. Predict if tumor is benign or malignant. This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Names appearing in the source citations: William H. Wolberg, Street, and O.L. Mangasarian.

Scripts for dataset are located in directory scripts. The Jupyter script edits the meta.csv file created by prepare_dataset.py.

South Australian Cancer Registry. CC BY-NC-SA 4.0.

CORGIS: The Collection of Really Great, Interesting, ... Cancer. It is in CSV format and includes the following information about cancer in the US: death rates, reported cases, US county name, income per county, population, demographics, and … Tags: cancer, cancer deaths, medical, health.

Other tasks listed in the catalog: predict the status of marijuana legalization of US states (classification); predict if an individual makes greater or less than $50000 per year; predict whether a congressman is Democrat or Republican based on voting patterns (classification); predict outcome of chess with 2 kings and 1 rook (classification).
W.H. Wolberg and O.L. Mangasarian: "Multisurface method of pattern separation for medical diagnosis applied to breast cytology", Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990, pp 9193-9196.

Data Set Specifications (DSS) are collections of data items (metadata) that are not mandated for collection but are recommended as best practice.

To provide your feedback on the draft datasets, please email any comments directly to datasets@iccr-cancer.org by Friday 19th February 2021. Please include your … The aim is to ensure that the datasets produced for different tumour types have a consistent style and content, and contain all the parameters needed to guide management and prognostication for individual cancers.

The simplest and most common format for datasets you'll find online is a spreadsheet or CSV format: a single file organized as a table of rows and columns. But some datasets will be stored in other formats, and they don't have to be just one file.

Create a classifier that can predict the risk of having breast cancer with routine parameters for early detection.

Dataset (CSV file): Shoulder Pain Data.

Tags: cancer, colon, colon cancer. A phase II study of adding the multikinase sorafenib to existing endocrine therapy in patients with metastatic ER-positive breast cancer.

Other tasks listed in the catalog: determine male or female based on voice charac… (regression); predict whether a tumor is benign or malignant (classification); predict which way a scale is tipped or if it's balanced (classification); predict whether a mushroom species is edible or poisonous (classification); predict class based on planned distributions (classification).

Thanks go to M. Zwitter and M.
Soklic for providing the data. Matjaz Zwitter & Milan Soklic (physicians), Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia -- Donors: Ming Tan and Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu) -- Date: 11 July 1988.

A dataset, or data set, is simply a collection of data.

For datasets with copy number information (Cambridge, Stockholm and MSKCC), the frequency of alterations in different clinical covariates is displayed. It focuses on characteristics of the cancer, including information not available in …

Data Set Information: This data was used by Hong and Young to illustrate the power of the optimal discriminant plane even in ill-posed settings. Applying the KNN method in the resulting plane gave 77% accuracy.

Breast cancer occurrences. This dataset is taken from OpenML - breast-cancer. Script: scripts/main.py.

The dataset contains data from cancer.gov, clinicaltrials.gov, and the American Community Survey.

Breast cancer (cancer registries) Data Set Specification.

Biostat 514/517 Datasets. CSV Datasets. Documentation; Dataset (CSV file); Dataset (STATA format); Dataset in "Wide" Format (STATA format).

The following PLCO Prostate dataset(s) are available for delivery on CDAS.

I opened it with LibreOffice Calc, added the column names as described in the breast-cancer-wisconsin NAMES file, and saved the file as CSV. 'Diagnosis' is the column we are going to predict; it says whether the cancer is M = malignant or B = benign.

Other tasks listed in the catalog: predict which chord was played in a Bach piece given pitch, bass and meter (classification); use chemical analysis to determine the origin of wines.

Cancer datasets and tissue pathways.
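The text above describes the label in two ways: as M/B in the 'Diagnosis' column, and as 1 = malignant, 0 = benign. A minimal sketch of converting one into the other is shown here, using only the Python standard library. The sample rows, the `id` and `radius_mean` columns, and the helper names `encode_diagnosis` and `load_targets` are assumptions made for illustration; the real file's header may differ.

```python
import csv
import io

# Hypothetical sample rows for illustration; the real file's columns may differ.
SAMPLE_CSV = """id,Diagnosis,radius_mean
1,M,17.99
2,B,13.54
3,M,20.57
"""

# 1 = malignant, 0 = benign, as stated in the text.
LABELS = {"M": 1, "B": 0}


def encode_diagnosis(value):
    """Map 'M'/'B' to 1/0 and fail loudly on any other value."""
    key = value.strip().upper()
    if key not in LABELS:
        raise ValueError(f"unexpected diagnosis value: {value!r}")
    return LABELS[key]


def load_targets(text, column="Diagnosis"):
    """Read CSV text and return the encoded target column as a list of ints."""
    reader = csv.DictReader(io.StringIO(text))
    return [encode_diagnosis(row[column]) for row in reader]


targets = load_targets(SAMPLE_CSV)
print(targets)
```

Raising on unknown values, rather than silently mapping them to 0, is a deliberate choice in this sketch: a mislabeled or empty cell should not be counted as benign.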
High quality datasets to use in your favorite Machine Learning algorithms and libraries.

Tasks listed in the catalog: predict relative performance of computer hardware (regression); predict occurrence of diabetes within the PIMA Native American Group; predict age of abalone from physical measurements; predict human activity based on smartphone movement measurements; predicting client's subscription depending on background; predict flower type of the Iris plant species.

Medical literature: W.H. …

Acknowledgements. These files contain summary statistics by age, year and sex for major cancers.

The College's Datasets for Histopathological Reporting on Cancers have been written to help pathologists work towards a consistent approach for the reporting of the more common cancers and to define the range of acceptable practice in handling pathology specimens.

Licensed under the Public Domain Dedication and License (assuming either no rights or public domain license in source data).

[data][xs]: removed duplicated rows reported by goodtables validation. Data file: data/breast-cancer.csv. Download breast-cancer-wisconsin-wdbc: 122KB compressed.
"CSV" stands for "comma-separated values", though many datasets use a delimiter other than a comma. Note: the link above will prompt the download of a zipped .csv file. For each dataset, a Data Dictionary that describes the data is publicly available. Gaining access to this dataset requires completing a set of steps.

Cancer deaths for the period 2007-2013 are reported for each U.S. state.

… extracted in machine readable form from the AIHW Australian Cancer Incidence and Mortality books. … are collected under the Health Care Act 2008.

… worked with stakeholders to develop a number of cancer-related DSS as follows: cancer (clinical) data set …

… extra label needed to annotate and distinguish each nodule.

The breast cancer dataset is taken from the UCI Machine Learning Repository. Please include this citation if you plan to use this database: "Machine learning techniques to diagnose breast cancer from fine-needle aspirates"; … (4), pages 570-577, July-August 1995.

However, these results are strongly biased (see Aeberhard's second ref above, or email to stefan '@' coral.cs.jcu.edu.au).
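Since the text notes that many "CSV" files use a delimiter other than a comma, a reader may want to detect the delimiter before parsing. The sketch below uses `csv.Sniffer` from the Python standard library, which is a heuristic and can guess wrong on unusual files. The helper names `detect_delimiter` and `read_rows`, the candidate delimiter set, and the sample text are assumptions for illustration only.

```python
import csv


def detect_delimiter(sample, candidates=",;\t|", default=","):
    """Guess the delimiter of CSV text; fall back to `default` if the guess fails."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not decide (for example, a single-column file).
        return default


def read_rows(text):
    """Parse CSV text into a list of rows using the detected delimiter."""
    delimiter = detect_delimiter(text)
    return list(csv.reader(text.splitlines(), delimiter=delimiter))


# Hypothetical semicolon-delimited sample.
sample = "id;diagnosis;radius\n1;M;17.99\n2;B;13.54\n"
print(detect_delimiter(sample))
print(read_rows(sample)[0])
```

Restricting the candidates to a short list keeps the heuristic from picking characters such as the decimal point, at the cost of missing truly exotic delimiters.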