Xavier University Library has a wide variety of electronic books, videos, and tutorials on data curation, management, and visualization. All of the resources below are available to members of the Xavier community free of charge. Contact your Personal Librarian with any questions about accessing these resources.
Information about accessing Safari Books Online can be found here.
Safari Books Online
Business Analytics with R: First Steps
To gain a competitive advantage, businesses need to implement data-driven strategies. Discerning which type of analysis to use can be a challenge with so many data types and models available. Pierre will cover the key frameworks to drive advanced analysis of business data.
Data Science from the Shell
The command line is a great environment for inspecting a dataset, automating data science tasks, and more. Hit the ground running with these picks from Jeroen Janssens, author of "Data Science at the Command Line".
Essential Math for Data Science
Learn fundamental linear algebra, calculus, probability, and statistics using Python—vital skills for data science—with resources from Hadrien Jean.
Getting Started with Data Visualization
Learn how to turn your complex datasets into clear and compelling visuals, with these picks from Claus O. Wilke.
These picks from Tableau Zen Master Ryan Sleeper explore Tableau's most powerful technical features, how to make useful charts, and more.
Programming for Data Science: Beginner to Intermediate
Jesse Anderson points out the best programming resources if you're a data science professional who may not come from a computer science or coding background.
SQL for Data Science: Advanced
SQL proficiency is a must for interacting with data. These picks from Robert de Graaf will help you move beyond the fundamentals to advanced concepts.
Storytelling with Data by
Publication Date: 2015-11-02
Don't simply show your data--tell a story with it! Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples--ready for immediate application to your next graph or presentation. Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story. Specifically, you'll learn how to: understand the importance of context and audience; determine the appropriate type of graph for your situation; recognize and eliminate the clutter clouding your information; direct your audience's attention to the most important parts of your data; think like a designer and utilize concepts of design in data visualization; and leverage the power of storytelling to help your message resonate with your audience. Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data--Storytelling with Data will give you the skills and power to tell it!
The Truthful Art: Data, Charts and Maps for Communication by
Publication Date: 2016-02-18
Every day, at work, home, and school, we are bombarded with vast amounts of free data collected and shared by everyone and everything from our co-workers to our calorie counters. In this highly anticipated follow-up to The Functional Art--Alberto Cairo's foundational guide to understanding information graphics and visualisation--the respected data visualisation professor explains in clear terms how to work with data, discover the stories hidden within, and share those stories with the world in the form of charts, maps, and infographics. In The Truthful Art, Cairo transforms elementary principles of data and scientific reasoning into tools that you can use in daily life to interpret data sets and extract stories from them. The Truthful Art explains: the role infographics and data visualisation play in our world; basic principles of data and scientific reasoning that anyone can master; how to become a better critical thinker; step-by-step processes that will help you evaluate any data visualisation (including your own); and how to create and use effective charts, graphs, and data maps to explain data to any audience.
GIS Fundamentals by
Publication Date: 2013-09-25
With GIS technology increasingly available to a wider audience on devices from apps on smartphones to satnavs in cars, many people routinely use spatial data in a way which used to be the preserve of GIS specialists. However, how spatial data is stored and analyzed on a computer still tends to be described in academic texts and articles which require specialist knowledge or some training in computer science. Developed to introduce computer science literature to geography students, GIS Fundamentals, Second Edition provides an accessible examination of the underlying principles for anyone with no formal training in computer science. See What's New in the Second Edition: coverage of the use of spatial data on the Internet; chapters on databases and on searching large databases for spatial queries; improved coverage on route-finding; improved coverage of heuristic approaches to solving real-world spatial problems; and international standards for spatial data. The book begins with a brief but detailed introduction to how computers work and how they are programmed, giving anyone with no previous computer science background a foundation to understand the remainder of the book. As with all parts of the book there are also suggestions for further sources of reading. The book then describes the ways in which vector and raster data can be stored and how algorithms are designed to perform fundamental operations such as detecting where lines intersect. From these simple beginnings the book moves into the more complex structures used for handling surfaces and networks and contains a detailed account of what it takes to determine the shortest route between two places on a network. The final sections of the book review problems, such as the "Travelling Salesman" problem, which are so complex that it is not known whether an optimum solution exists.
Using clear, concise language, but without sacrificing technical rigour, the book gives readers an understanding of what it takes to produce systems which allow them to find out where to make their next purchase and how to drive to the right place to collect it.
Hands-On Geospatial Analysis with R and QGIS by
Publication Date: 2018-11-30
Practical examples with real-world projects in GIS, Remote sensing, Geospatial data management and Analysis using the R programming language. Key Features: understand the basics of R and QGIS to work with GIS and remote sensing data; learn to manage, manipulate, and analyze spatial data using R and QGIS; apply machine learning algorithms to geospatial data using R and QGIS. Book Description: Managing spatial data has always been challenging and it's getting more complex as the size of data increases. Spatial data is actually big data and you need different tools and techniques to work your way around to model and create different workflows. R and QGIS have powerful features that can make this job easier. This book is your companion for applying machine learning algorithms on GIS and remote sensing data. You'll start by gaining an understanding of the nature of spatial data and installing R and QGIS. Then, you'll learn how to use different R packages to import, export, and visualize data, before doing the same in QGIS. Screenshots are included to ease your understanding. Moving on, you'll learn about different aspects of managing and analyzing spatial data, before diving into advanced topics. You'll create powerful data visualizations using ggplot2, ggmap, raster, and other packages of R. You'll learn how to use QGIS 3.2.2 to visualize and manage (create, edit, and format) spatial data. Different types of spatial analysis are also covered using R. Finally, you'll work with landslide data from Bangladesh to create a landslide susceptibility map using different machine learning algorithms. By reading this book, you'll transition from being a beginner to an intermediate user of GIS and remote sensing data in no time.
What you will learn: install R and QGIS; get familiar with the basics of R programming and QGIS; visualize quantitative and qualitative data to create maps; find out the basics of raster data and how to use them in R and QGIS; perform geoprocessing tasks and automate them using the graphical modeler of QGIS; apply different machine learning algorithms on satellite data for landslide susceptibility mapping and prediction. Who this book is for: This book is great for geographers, environmental scientists, statisticians, and every professional who deals with spatial data. If you want to learn how to handle GIS and remote sensing data, then this book is for you. Basic knowledge of R and QGIS would be helpful but is not necessary.
Practical GIS by
Publication Date: 2017-06-13
Learn the basics of Geographic Information Systems by solving real-world problems with powerful open source tools.
About This Book
* This easy-to-follow guide allows you to manage and analyze geographic data with ease using open source tools
* Publish your geographical data online
* Learn the basics of geoinformatics in a practical way by solving problems
Who This Book Is For
The book is for IT professionals who have little or no knowledge of GIS. It's also useful for those who are new to the GIS field who don't want to spend a lot of money buying licenses of commercial tools and training.
What You Will Learn
* Collect GIS data for your needs
* Store the data in a PostGIS database
* Exploit the data using the power of the GIS queries
* Analyze the data with basic and more advanced GIS tools
* Publish your data and share it with others
* Build a web map with your published data
In Detail
The most commonly used GIS tools automate tasks that were historically done manually--compiling new maps by overlaying one on top of the other or physically cutting maps into pieces representing specific study areas, changing their projection, and getting meaningful results from the various layers by applying mathematical functions and operations. This book is an easy-to-follow guide to use the most matured open source GIS tools for these tasks. We'll start by setting up the environment for the tools we use in the book. Then you will learn how to work with QGIS in order to generate useful spatial data. You will get to know the basics of queries, data management, and geoprocessing. After that, you will start to practice your knowledge on real-world examples. We will solve various types of geospatial analyses with various methods. We will start with basic GIS problems by imitating the work of an enthusiastic real estate agent, and continue with more advanced, but typical tasks by solving a decision problem. Finally, you will find out how to publish your data (and results) on the web. We will publish our data with QGIS Server and GeoServer, and create a basic web map with the API of the lightweight Leaflet web mapping library.
Style and approach
The book guides you step by step through each of the core concepts of the GIS toolkit, building an overall picture of its capabilities. This guide approaches the topic systematically, allowing you to build upon what you learned in previous chapters. By the end of this book, you'll have an understanding of the aspects of building a GIS system and will be able to take that knowledge with you to whatever project calls for it.
Essential Statistics for Non-STEM Data Analysts by
Publication Date: 2020-11-12
Reinforce your understanding of data science and data analysis from a statistical perspective to extract meaningful insights from your data using Python programming. Key Features: work your way through the entire data analysis pipeline with statistics concerns in mind to make reasonable decisions; understand how various data science algorithms function; build a solid foundation in statistics for data science and machine learning using Python-based examples. Book Description: Statistics remain the backbone of modern analysis tasks, helping you to interpret the results produced by data science pipelines. This book is a detailed guide covering the math and various statistical methods required for undertaking data science tasks. The book starts by showing you how to preprocess data and inspect distributions and correlations from a statistical perspective. You'll then get to grips with the fundamentals of statistical analysis and apply its concepts to real-world datasets. As you advance, you'll find out how statistical concepts emerge from different stages of data science pipelines, understand the summary of datasets in the language of statistics, and use it to build a solid foundation for robust data products such as explanatory models and predictive models. Once you've uncovered the working mechanism of data science algorithms, you'll cover essential concepts for efficient data collection, cleaning, mining, visualization, and analysis. Finally, you'll implement statistical methods in key machine learning tasks such as classification, regression, tree-based methods, and ensemble learning. By the end of this Essential Statistics for Non-STEM Data Analysts book, you'll have learned how to build and present a self-contained, statistics-backed data product to meet your business goals.
What you will learn: find out how to grab and load data into an analysis environment; perform descriptive analysis to extract meaningful summaries from data; discover probability, parameter estimation, hypothesis tests, and experiment design best practices; get to grips with resampling and bootstrapping in Python; delve into statistical tests with variance analysis, time series analysis, and A/B test examples; understand the statistics behind popular machine learning algorithms; answer questions on statistics for data scientist interviews. Who this book is for: This book is an entry-level guide for data science enthusiasts, data analysts, and anyone starting out in the field of data science and looking to learn the essential statistical concepts with the help of simple explanations and examples. If you're a developer or student with a non-mathematical background, you'll find this book useful. Working knowledge of the Python programming language is required.
Practical Statistics for Data Scientists by
Publication Date: 2020-06-02
Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you'll learn: why exploratory data analysis is a key preliminary step in data science; how random sampling can reduce bias and yield a higher-quality dataset, even with big data; how the principles of experimental design yield definitive answers to questions; how to use regression to estimate outcomes and detect anomalies; key classification techniques for predicting which categories a record belongs to; statistical machine learning methods that "learn" from data; and unsupervised learning methods for extracting meaning from unlabeled data.
SPSS Statistics for Dummies by
Publication Date: 2020-09-09
The fun and friendly guide to mastering IBM's Statistical Package for the Social Sciences. Written by an author team with a combined 55 years of experience using SPSS, this updated guide takes the guesswork out of the subject and helps you get the most out of using the leader in predictive analysis. Covering the latest release and updates to SPSS 27.0, and including more than 150 pages of basic statistical theory, it helps you understand the mechanics behind the calculations, perform predictive analysis, produce informative graphs, and more. You'll even dabble in programming as you expand SPSS functionality to suit your specific needs. Master the fundamental mechanics of SPSS; learn how to get data into and out of the program; graph and analyze your data more accurately and efficiently; program SPSS with Command Syntax. Get ready to start handling data like a pro--with step-by-step instruction and expert advice!
MIT Press Direct
Visualization and Interpretation by
Publication Date: 2020-11-10
In the several decades since humanists have taken up computational tools, they have borrowed many techniques from other fields, including visualization methods to create charts, graphs, diagrams, maps, and other graphic displays of information. But are these visualizations actually adequate for the interpretive approach that distinguishes much of the work in the humanities? Information visualization, as practiced today, lacks the interpretive frameworks required for humanities-oriented methodologies. In this book, Johanna Drucker continues her interrogation of visual epistemology in the digital humanities, reorienting the creation of digital tools within humanities contexts. Drucker examines various theoretical understandings of visual images and their relation to knowledge and how the specifics of the graphical are to be engaged directly as a primary means of knowledge production for digital humanities. She draws on work from aesthetics, critical theory, and formal study of graphical systems, addressing them within the specific framework of computational and digital activity as they apply to digital humanities. Finally, she presents a series of standard problems in visualization for the humanities (including time/temporality, space/spatial relations, and data analysis), posing the investigation in terms of innovative graphical systems informed by probabilistic critical hermeneutics. She concludes with a final brief sketch of discovery tools as an additional interface into which modeling can be worked.
Data Action by
Publication Date: 2020-12-08
Big data can be used for good--from tracking disease to exposing human rights violations--and for bad--implementing surveillance and control. Data inevitably represents the ideologies of those who control its use; data analytics and algorithms too often exclude women, the poor, and ethnic groups. In Data Action, Sarah Williams provides a guide for working with data in more ethical and responsible ways. Too often data has been used--and manipulated--to make policy decisions without much stakeholder input. Williams outlines a method that emphasizes collaboration among data scientists, policy experts, data designers, and the public. This approach creates trust and co-ownership in the data by opening the process to those who know the issues best.
All Data Are Local by
Publication Date: 2019-04-30
How to analyze data settings rather than data sets, acknowledging the meaning-making power of the local. In our data-driven society, it is too easy to assume the transparency of data. Instead, Yanni Loukissas argues in All Data Are Local, we should approach data sets with an awareness that data are created by humans and their dutiful machines, at a time, in a place, with the instruments at hand, for audiences that are conditioned to receive them. The term data set implies something discrete, complete, and portable, but it is none of those things. Examining a series of data sources important for understanding the state of public life in the United States--Harvard's Arnold Arboretum, the Digital Public Library of America, UCLA's Television News Archive, and the real estate marketplace Zillow--Loukissas shows us how to analyze data settings rather than data sets. Loukissas sets out six principles: all data are local; data have complex attachments to place; data are collected from heterogeneous sources; data and algorithms are inextricably entangled; interfaces recontextualize data; and data are indexes to local knowledge. He then provides a set of practical guidelines to follow. To make his argument, Loukissas employs a combination of qualitative research on data cultures and exploratory data visualizations. Rebutting the "myth of digital universalism," Loukissas reminds us of the meaning-making power of the local.
Databases with Downloadable Datasets
Datasets cover a wide range of subjects – including business, finance, banking, economics, sociology, political science, demography, agriculture, education, international studies, criminal justice, housing and construction, labor and employment, energy resources and industries, and more. Sources include public, private/commercial, and nongovernmental organizations, as well as Woods & Poole Economics, Inc.
DemographicsNow (Gale Business)
From population and Census data to housing expenditures, users of DemographicsNow will have access to demographic information including income, housing, race, age, education, retail spending, consumer expenditures, businesses and more.
Historical Statistics of the United States
From 1800s-1990s; full-text essays & introductions; downloadable data
This database provides quantitative data on population, work & welfare, economic structure & performance, economic sectors, and governance & international relations. It covers social, behavioral, humanistic, and natural sciences including history, economics, government, finance, sociology, demography, education, law, natural resources, climate, religion, international migration, and trade. It permits users to search and download data as well as create customized tables, spreadsheets and graphs.
All Databases with Data/Statistics