The number of different data files is on the rise in our increasingly digital society. Big data has infiltrated all aspects of society, from health care to the economy and from education to science, as all sectors are confronted with it. This involves not just the creation of new data, but also new combinations of existing data. Infrastructure and processing techniques are also being adapted. Society needs to be able to deal with all its data. Computing power, algorithms, analyses and software are all needed to analyse these huge data files and implement the results.

 

Researchers introduce themselves:

Big data

  •  

    Numerous entities in cities generate data. All those buildings, infrastructures, production processes and even human relationships can be monitored with the help of data science. This monitoring, in turn, can be the basis for predicting future behaviour of the entities and making decisions about them. At the Leiden Institute of Advanced Computer Science (LIACS), data scientists are collaborating in various projects that support our society.

     

    Bridge behaviour

    Bridges, roads and tunnels are built to endure harsh conditions, such as heavy-duty traffic or extreme weather. How does the quality of such infrastructural assets evolve over time and how does this influence their safety? The InfraWatch project determined this for the Hollandse Brug, by equipping it with 145 sensors and monitoring the bridge over a six-year time span. Analysis of the data collected led to interesting new insights that can also be used to predict the state and adjust the maintenance of comparable bridges.

     

    Sewer pipe defects

    The roughly 100,000 km of sewer pipes in the Netherlands are essential to our national health and safety. In the SewerSense project, sanitary engineers of TU Delft and computer scientists of Leiden University are combining data from different sensors, and automating and improving the inspection process of the sewer system with the help of robots and modern camera techniques. This will improve the precision of the measurements, and increase the probability of finding specific defects without the intervention of human inspectors.

     

    Playground behaviour & health monitoring

    Not only infrastructures can be monitored. To get insight into toddlers' playground behaviour we attached sensors to the children's clothes and followed their interactions. This resulted in new conclusions about playground behaviour and the importance of physical play for the socio-emotional development of children. Another example is a sensor project conducted at the Leiden University Medical Center (LUMC). We wrote software to analyse the data from movement sensors on patients. We are also cooperating in a fraud detection investigation into irregularities in health insurance claims based on transactional data. It revealed claims that failed to comply with the rules, which would have cost the sector millions of euros.

     

    Effective data analysis

    The possibilities of data science have developed quickly over the past years. Three main factors contribute to this. Firstly, more and more data are becoming available. Furthermore, new algorithms are being developed that make data analysis more effective. And last but not least, the computational infrastructure is constantly improving: better and stronger machines, and faster connections.

  •  

    Data and information are important in all parts of today's society. Decisions on all levels are increasingly data-driven, which requires adequate and timely information. Linked to the huge increase of data, new legislation and regulations have been launched, bringing new responsibilities to the university and other parties regarding governance and reporting. The requirements for data handling are too complex for individual researchers and other employees. It is time to recognize that data management is a discipline in its own right, much as today's car drivers can no longer fix their own engines as drivers could in the early days.

     

    Therefore, at the University of Groningen/University Medical Center Groningen a two-year program has been initiated to set up a service and IT infrastructure, focusing first on researchers in Human Subjects Research. The program facilitates research on human subjects by providing the researcher with an up-to-date IT infrastructure that supports the collection, processing, storage and (re)use of research data, while ensuring data security and properly protecting the privacy of participants. The IT infrastructure will enable innovative research and is in sync with the current vision on research data management and FAIR data.

  •  

    The impact of the phenomenal rise in the volume and variety of big data is evident in every sector of society, from health and well-being to government, from the financial sector to energy, and in fact every facet of daily life. Big data is not simply 'a lot of data'; it also means new combinations and continuous flows of ever-changing data. After deep processing and analysis, big data comes back to us in the form of information and views that describe our world, recommendations and decision support on what to do, and predictions on what lies ahead.

     

    Profound changes

    In our research we study questions such as which genetic algorithms can deliver the best predictive value for non-invasive prenatal testing (NIPT). With our Network Telescope we study how internet traffic patterns can teach us if and where cyber-attacks take place. And with our Social Glasses we develop algorithms that help in guiding flows of visitors during city events. In this way, big data profoundly changes the way we govern and structure our society, do business, carry out research, and behave.

     

    With the growing pervasiveness of big data, our digital society also becomes increasingly dependent on the outcomes of big data methods. This dependency concerns what data can be collected and accessed, and by whom. It also concerns how the deep analysis methods process data based on programmed assumptions and (often implicit) values about the world. And finally, when big data systems break down, many aspects of our digital society come to a grinding halt. Increasingly, research emphasizes methods that adhere to the principles of FAIR (findable, accessible, interoperable, reusable) data, FACT (fair, accurate, confidential, transparent) algorithms, and ROBUST system design.

     

    Delft Data Science

    At Delft Data Science (DDS), we study the science and engineering methods for data collection, programming, processing and visualization in the fields of health, smart cities, smart culture, sports, on-line education, security, mobility and transport, and robotics. In many of these cases large amounts of data are collected from human beings, measured either from the body or from behavior. Examples are recorded health parameters, sequenced genomes, behavior of individuals in cities, in traffic and at sports (events), surfing and clicking behavior on the internet, and all sorts of personal, professional and entertainment preferences.

     

    Increasing demand for protection

    We research methods that anticipate the increasing demand for the protection of such data. As a solution, our privacy-protected signal processing algorithms process big data under encryption. The advantage of such an approach is that privacy-sensitive or competition-sensitive data can be combined, processed and studied without disclosing the individual data elements themselves. This enabled us, for instance, to develop a facial recognition system where the face to be recognized is not disclosed to the system. We apply the same approach to genome-wide association study (GWAS) data, and to data logged by software running on user devices.
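
    To make the idea of computing on encrypted data concrete, the following is a minimal sketch of additively homomorphic encryption using the textbook Paillier scheme. It is an illustration only and is not claimed to be the method used at DDS. The function names and the tiny primes are assumptions chosen for readability; keys of this size offer no security. The modular inverse via `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy Paillier key pair. The small default primes are for illustration only."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse of phi modulo n
    return n, (phi, mu)   # public key n, private key (phi, mu)


def encrypt(n, m):
    """Encrypt an integer m with 0 <= m < n under public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    # Generator g = n + 1 is the standard simplification.
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (modulo n)."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    phi, mu = private_key
    u = pow(c, phi, n * n)
    return ((u - 1) // n * mu) % n


# Two parties encrypt their values; a third party adds them without seeing either.
n, private_key = keygen()
c_total = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
total = decrypt(n, private_key, c_total)
```

    Only the holder of the private key learns the result, here the sum 42, while the party performing the addition sees ciphertexts only.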

  •  

    The omnipresence of data is fueling our modern society, provides new business opportunities, and is rapidly changing scientific research. We should invest in our core data-science capabilities and not just in selected application domains. Although this is widely acknowledged, investments in actual data science research are limited. We have a great network of Dutch data science centers and in a few data-science areas we still have a leading position. Unfortunately, these strengths are not cherished and data-science-related investments in other regions of the world are much more substantial. Time for action!

     

    Process mining to extract value from event data

    Process mining is one of the sub-disciplines where the Netherlands is leading. Event data related to processes in society, organizations, mobility, industry, and health can be used to improve our daily lives and create economic value. Process mining converts such data into process models showing what is really happening, answering questions related to 'What?', 'Why?', 'When?' and 'How?'. Process-mining results can be used to remove bottlenecks and ensure compliance based on the 'digital evidence' found in today's information systems.
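
    As a minimal sketch of how event data can be turned into a simple process model, the example below counts how often one activity is directly followed by another within the same case, which yields a so-called directly-follows graph. The event log, its field layout (case id, timestamp, activity) and the function name are hypothetical; real process-mining tools use far richer discovery algorithms.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity a is directly followed by activity b.

    event_log: iterable of (case_id, timestamp, activity) tuples.
    Returns a Counter mapping (a, b) pairs to their frequency.
    """
    traces = defaultdict(list)
    for case_id, timestamp, activity in event_log:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for events in traces.values():
        events.sort(key=lambda event: event[0])  # order events of a case by time
        activities = [activity for _, activity in events]
        for current, following in zip(activities, activities[1:]):
            counts[(current, following)] += 1
    return counts


# Hypothetical event log with three order-handling cases.
log = [
    ("order-1", 1, "receive"),
    ("order-1", 2, "check"),
    ("order-1", 3, "ship"),
    ("order-2", 1, "receive"),
    ("order-2", 2, "check"),
    ("order-2", 3, "reject"),
    ("order-3", 1, "receive"),
    ("order-3", 2, "ship"),
]
dfg = directly_follows(log)
```

    In this toy log the pair ("receive", "ship") occurs once, which shows that one order skipped the check step. Deviations of this kind are what compliance analysis looks for.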

     

    Data Science Center Eindhoven (DSC/e)

    In the Data Science Center Eindhoven (DSC/e) over 30 research groups have joined forces to take on the challenges posed by the pervasiveness of data and the impact on our digital society. The DSC/e collaborates with industry in the Brainport region and beyond, runs five larger cross-cutting research programs, and serves as a meeting place for data science experts.

     

    Responsible Data Science: Fairness, Accuracy, Confidentiality, and Transparency

    Big data and the tools provided by data science influence the way we live, work, communicate, learn, and decide. Therefore, we should use these in a responsible manner. The Responsible Data Science (RDS) initiative revolves around four main challenges:

    1. Data science without prejudice – How to avoid unfair conclusions even if they are true?
    2. Data science without guesswork – How to answer questions with a guaranteed level of accuracy?
    3. Data science that ensures confidentiality – How to answer questions without revealing secrets?
    4. Data science that provides transparency – How to clarify answers such that they become indisputable?

    Questions related to Fairness, Accuracy, Confidentiality, and Transparency (FACT) need to be addressed urgently. Therefore, the RDS initiative is crucial for our digital society.

  •  

    I am scientific director of BISS, the Business Intelligence and Smart Services Institute, a joint initiative by Maastricht University, Open University, and University of Applied Sciences Zuyd. BISS performs applied and fundamental research on the design of digital, or 'smart' services using advanced methods in data science, in particular from Artificial Intelligence. Research topics are defined and explored in collaboration with companies located at the Brightlands Smart Services Campus in Heerlen. Services developed at BISS aim to combine technological innovation, business innovation and social innovation in areas like consumer finance, pensions, health, and citizenship. BISS integrates expertise in statistics, artificial intelligence, information management, marketing, behavioural economics and finance, as well as ethics and law.

     

    My own research over the past fifteen years has been mainly in the field of economic and algorithmic mechanism design, modelling situations in which divergent preferences have to be aligned towards a common economic or social goal, and asymmetric information poses opportunities for strategic behaviour. The typical business example is the procurement of goods and services by means of an auction, where the goal is to identify the least expensive or most capable contractor. The choice of the protocol as well as the behaviour of the participants can have significant influence on the extent to which a social goal is achieved. In the digital economy, many platforms are using auction and matching protocols for real-time clearing of markets, adding algorithmic challenges to the design of mechanisms.
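
    The influence of the protocol can be illustrated with a small textbook sketch of a sealed-bid procurement auction in which the lowest asking price wins. Under a first-price rule the winner is paid its own bid; under a second-price (Vickrey) rule it is paid the second-lowest bid. The contractor names, prices and function name are hypothetical, and the sketch does not represent the specific mechanisms studied in this research.

```python
def procurement_auction(bids, rule="second-price"):
    """Select the cheapest contractor and compute the payment.

    bids: dict mapping contractor name to asking price.
    rule: "first-price" pays the winner its own bid,
          "second-price" pays the winner the second-lowest bid.
    Returns a (winner, payment) tuple.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1])
    winner, lowest_price = ranked[0]
    if rule == "first-price":
        payment = lowest_price
    elif rule == "second-price":
        payment = ranked[1][1]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return winner, payment


# Hypothetical sealed bids from three contractors.
bids = {"contractor_a": 120, "contractor_b": 100, "contractor_c": 150}
first = procurement_auction(bids, rule="first-price")
second = procurement_auction(bids, rule="second-price")
```

    With these bids the same contractor wins under both rules, but the payment is 100 under the first-price rule and 120 under the second-price rule. Because the second-price payment does not depend on the winner's own bid, that rule removes the incentive to overstate one's costs.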

  •  

    I have a chair in statistics for the social and behavioural sciences at Utrecht University and a position as professor in social statistics at the University of Southampton. My research focuses on the proper use of large data sets. My research group has a strong reputation in the support of other researchers in the field of data science. Together with scientists from the Faculty of Science, Faculty of Geosciences and Faculty of Humanities and UMC Utrecht I am building the Utrecht Platform for Applied Data Science. We will bring together researchers, students and professionals from outside the university to create a lively data science community. Furthermore, we will promote data science by providing a wide range of data science courses on various levels for specialists, students and scientists. The platform will help everyone with questions that are relevant to society.

  •  

    Natural language is more and more recognized as a phenomenon that represents valuable social and cultural data. Through the synergy with computer science, the field of language and speech technology has grown into a crucial pillar of the field of data science. The multidisciplinary collaboration with the humanities and the social sciences is now demonstrating how the development of models and mining can be put to use for studying and supporting the interaction with cultural heritage data, the design of language courses for people with a migration background, and the detection of language patterns that signal mental health problems, to name just a few examples. The wealth of language data in social media can also be explored by text analytics for the purpose of understanding the underlying societal phenomena.

     

    More generally speaking, research into the mining of textual and spoken materials contributes to the potential of studying and supporting the societal roles of language: as a carrier of cultural content and information, as a reflection of scientific and societal knowledge, as an instrument for human communication, as one of the central components of the identity of individuals, groups, cultures or nations, and as an instrument for human expression. Attention to the potential for reuse of language data through infrastructural facilities, including access, discovery and analysis services, also contributes to the uptake of the multidisciplinary research results coming from this field.

  •  

    The Open University Computer Science Research Group (OU-CS) is a recently created research group of about 15 researchers whose main research area is 'Trustworthy Systems in the Digital Society'. This area comprises subjects such as Digital Privacy and Security, Automatic User Interface Testing, Distributed Programming, Formal Verification of Safety Critical Systems, Energy Consumption Analysis of Software Controlled Systems, Machine Learning, Domain Reasoners for Intelligent Feedback and Computer Science Teaching Methodology.

     

    The OU-CS research group participates in the Cyber Science Center (CSC), which is a collaboration between NHL University of Applied Sciences, the Dutch Police Academy and the Open University. These three groups complement each other and together form a wide and strong network, thanks to their diversity of knowledge and skills, their range of practical abilities (evaluation, grant acquisition, educational development) and their complementary networks.

  •  

    Digital information technology is permeating society through machines and appliances. Such omnipresent machines are in fact fast becoming intelligent robots: they generate large amounts of (sensor) data that they can process through powerful learning algorithms, and they are connected to other machines and the network.

     

    Within physical reach

    Intelligent machines have been hidden behind car factory walls for half a century, but they will now come within our physical reach. Vacuum robots are already a familiar sight, and the self-driving car is not far away. In all sectors, including industry, healthcare and even in our homes, we will encounter machines with exceptional powers to perceive, learn, and make decisions. Despite the large potential benefits, this raises important questions about the safety, usability, control, and distribution of such systems.

     

    Human-robot interaction

    How can you make an intelligent leg prosthesis only move when it should? How can we design robot swarms that can join teams of human rescuers? How can we design an intelligent car so that the human user and the car can understand each other's intentions? In a factory setting, the challenge is to let the robot exploit its tireless precision to the maximum, while human workers get to make the most of their creativity – and to do this safely. At the TU Delft Robotics Institute we research all aspects of robotics, but with a strong focus on such human-robot interaction. We also aim to distribute the potential of robotics fairly throughout our societies, for example by developing open-source robotics software.

  •  

    The digital society offers huge amounts of data that can be used for better decision making. In the more theoretical part of my research, I develop and study new 'advanced analytics' models and techniques for smarter decisions and better results. Although we live in a big data era, 'veracity' is still one of the biggest problems for decision makers. Therefore, I am especially interested in prescriptive analytics techniques that are not only scalable but can also generate solutions that are robust against all kinds of uncertainties.

     

    On the practical side, I apply prescriptive analytics to important societal problems to make the world a little bit better. Recently, I carried out a large project for the Dutch government to optimize the dike heights in the Netherlands. The results are now laid down in the Dutch Water Act and have been estimated to save more than 7 billion euros. A second important application area for me is developing robust optimization techniques for finding optimal radiotherapy treatment plans. Nowadays, hundreds of hospitals are using large-scale optimization techniques for finding optimal plans for thousands of patients. Finally, I am involved in developing optimization models and techniques for the UN World Food Programme to optimize the food supply for hungry people. The model developed has been applied extensively in countries such as Syria, Yemen, Iraq and Ethiopia. In Syria, our prescriptive model made it possible to feed more than 1 million additional people. Although these three applications are very different, the prescriptive models and techniques used are generic.

  •  

    Rotterdam has recently won several 'smart city' awards, yet random passers-by at the Coolsingel possibly have no clue what a 'smart city' is. Maybe it has 'something to do with computers', or is it a recent film? These are at least the best answers the British Institute of Engineering and Technology received when asking ordinary English city dwellers the same question.

     

    Invisibility

    That makes sense because many smart city developments are invisible. Cables are underground and one cannot see, hear or feel wireless signals. There are some giveaways in traffic lights, public transport information, WiFi routers, traffic loops and security cameras, but it is becoming increasingly important to expose all of the smart city and engage citizens in the design of the new technologies and practices.

     

    Personal vehicle warning

    A case in point is the way in which Rotterdam and Utrecht warned their visitors last year that they were entering the city's environmental zone. In both cities, speed cameras scanned license plates and checked them in real time against the database of the Netherlands Vehicle Authority (RDW). Next, YES or NO appeared on the roadside screen in Utrecht, while in Rotterdam the screen said that license plate xx-yy-11 was (or was not) allowed to enter. What is the effect of such a personal address? Do drivers feel personally responsible for their effect on the environment, or do they see the message as a breach of their privacy? And conversely, do visitors to Utrecht appreciate their anonymity, unaware that their data are being used?

     

    SHARED design

    One can raise such questions about the meaning of all these technologies and data practices for city-users, and it is imperative to engage them in the SHARED design of the smart city (Sustainable, Harmonious, Affective, Relevant, Empowering and Diverse). To do so, my group in the Leiden-Delft Erasmus Centre for Big, Open and Linked Data Cities conducts participatory action research to make the smart and datafied city visible to city-users in all their variety. We use instruments like 'data-walks' and 'data dialogues' to find out how ordinary citizens experience their smart city, while simultaneously raising their data-literacy and data-empowerment.
