Keynote Speech 1 : A Wakeup Call: Databases in an Untrusted Universe
Chair : Yunmook Nah, Dankook University
Room : Crystal(F1)
Time: Friday, 10:00 - 11:00
Amr El Abbadi
Department of Computer Science,
University of California, Santa Barbara
Once upon a time, databases were structured, one-size-fits-all systems that resided on trustworthy machines, and even when those machines failed, they simply crashed. This era has come and gone, as eloquently stated by Mike Stonebraker. We now have key-value stores, graph databases, text databases, and a myriad of unstructured data repositories. However, we, as a database community, still cling to our 20th-century belief that databases always reside on trustworthy, honest servers. This notion has been challenged and abandoned by many other Computer Science communities, most notably the security and the distributed systems communities. The rise of the cloud computing paradigm, as well as the rapidly growing popularity of blockchains, demands a rethinking of our naive, comfortable beliefs in an ideal, benign infrastructure. In the cloud, clients store their sensitive data on remote servers owned and operated by cloud providers. The Security and Crypto Communities have made significant inroads in protecting both data and access privacy from malicious, untrusted storage providers using encryption and oblivious data stores. The Distributed Systems and the Systems Communities have developed consensus protocols to ensure the fault-tolerant maintenance of data residing on untrusted, malicious infrastructure. However, these solutions face significant scalability and performance challenges when incorporated into large-scale data repositories. Novel database designs need to directly address the natural tension between performance, fault-tolerance and trustworthiness. This is a perfect setting for the database community to lead and guide. In this talk, I will discuss the state of the art in data management in malicious, untrusted settings, its limitations, and potential approaches to mitigate these shortcomings.
As examples, I will use cloud and distributed databases that reside on untrustworthy malicious infrastructure and discuss specific approaches for standard database problems like commitment and replication. I will also explore blockchains, which can be viewed as asset management databases in untrusted infrastructures.
Amr El Abbadi is a Professor of Computer Science at the University of California, Santa Barbara. He received his B. Eng. from Alexandria University, Egypt, and his Ph.D. from Cornell University. His research interests are in the fields of fault-tolerant distributed systems and databases, focusing recently on cloud data management and blockchain-based systems. Prof. El Abbadi is an ACM Fellow, AAAS Fellow, and IEEE Fellow. He was Chair of the Computer Science Department at UCSB from 2007 to 2011. He has served as an editor for several journals, including The VLDB Journal, IEEE Transactions on Computers and The Computer Journal. He has been Program Chair for multiple database and distributed systems conferences. He currently serves on the executive committee of the IEEE Technical Committee on Data Engineering (TCDE) and was a board member of the VLDB Endowment from 2002 to 2008. In 2007, Prof. El Abbadi received the UCSB Senate Outstanding Mentorship Award for his excellence in mentoring graduate students. In 2013, his student Sudipto Das received the SIGMOD Jim Gray Doctoral Dissertation Award. Prof. El Abbadi is also a co-recipient of the Test of Time Award at EDBT/ICDT 2015. He has published over 300 articles in databases and distributed systems and has supervised over 35 PhD students.
Keynote Speech 2 : No Data Left Behind – Exploiting Unstructured Data Using Database Systems
Chair : Sang Kyun Cha, Seoul National University
Room : Crystal(F1)
Time: Friday, 14:00 - 15:00
Wolfgang Lehner
Institute of System Architecture
Technische Universität Dresden (TU Dresden)
In our data-driven culture, more and more data sources of a semi-structured or unstructured nature are being incorporated into decision workflows. However, relational database systems are still the “lingua franca” for data storage, query processing, and large-scale analytics in almost every organization, and they will probably remain so for decades to come. Tapping into the value of unstructured data in the realm of database systems remains a challenging task. In this talk, I will present our journey of building database-centric systems that are able to exploit external knowledge during query processing, with an emphasis on Web tables and spreadsheets as well as textual documents. I will introduce the problem of table extraction and layout identification, give an idea of how to solve it, and present our initiative on building a corpus consisting of more than 125M Web tables. The extracted tables can be leveraged using relational augmentation techniques integrated into a database system by introducing a novel database engine operator dealing with top-k results. For textual data, I will report on recent developments in the field of language models, such as word embeddings, and outline how these can be utilized to enrich database query capabilities and enable inductive reasoning on text values stored in database tables.
Wolfgang Lehner is a full professor and head of the Database Technology Group as well as director of the Institute of System Architecture at TU Dresden, Germany. His research focuses on database system architectures, specifically looking at crosscutting aspects from data engineering algorithms and data structures down to hardware-related aspects, mostly in main-memory centric settings. He heads a Research Training Group on large-scale adaptive system software design and acts as a principal investigator in Germany’s national “Competence Center for Scalable Data Services and Solutions” (ScaDS). Wolfgang also maintains a close research relationship with the international SAP HANA development team. He serves the community on many program committees, is the Managing Editor of “Proceedings of the VLDB Endowment” (PVLDB), and serves on the grants committee of collaborative research centers within the German Research Foundation (DFG). He is an appointed member of the Academy of Europe.
Keynote Speech 3 : In-NVM DBMS – Is There A Case?
Chair : Jeffrey Xu Yu, The Chinese University of Hong Kong, Hong Kong
Time: Saturday, 10:00 - 11:00
Kian-Lee Tan
Department of Computer Science
National University of Singapore (NUS)
Today’s database management systems are essentially based on a two-layered storage architecture: (a) data are stored on cheap (high-capacity but slow) persistent storage like solid state drives (NAND flash) or magnetic disks; and (b) data are loaded and processed in volatile (fast but expensive) DRAM. More recently, the emergence of byte-addressable non-volatile memory (NVM) technologies, such as Intel/Micron’s 3D XPoint memory and phase change memory (PCM), has prompted researchers to investigate how best to exploit this technology for database systems. On the one hand, NVM can be used as a form of persistent cache for disks, so that “hot” data are stored on NVM while “cold” data remain on disks (leading to a 3-tier storage architecture). On the other hand, it is possible to have just a single-level storage architecture by replacing DRAM with NVM; given that NVM is non-volatile, the persistent storage tier can also be removed. This talk focuses on the latter, and examines the opportunities and challenges in building an in-NVM database management system.
Kian-Lee Tan is a Professor of Computer Science at the School of Computing, National University of Singapore (NUS). He received his Ph.D. in computer science in 1994 from NUS. His current research interests include query processing and optimization in multiprocessor and distributed systems, database performance, data science, and database security. Kian-Lee has published over 300 research articles in international journals and conference proceedings, and co-authored several books/monographs. Kian-Lee was a recipient of the NUS Outstanding University Researchers Award in 1998, and the NUS Graduate School (NGS) Excellent Mentor Award in 2011. He was a co-recipient of Singapore's President's Science Award in 2011. He is also a 2013 IEEE Technical Achievement Award recipient. Kian-Lee is a member of the VLDB Endowment Board (2012-2017) and the PVLDB Advisory Committee (2014-2017). He is an associate editor of the ACM Transactions on Database Systems (TODS) and the WWW Journal. He has also served on the editorial boards of the Very Large Data Base (VLDB) Journal (associate editor: 2007-2009; editor-in-chief: 2009-2015) and the IEEE Transactions on Knowledge and Data Engineering (2009-2013). Kian-Lee was the Technical Program Committee co-chair for the 27th International Conference on Data Engineering (ICDE 2011), the 36th International Conference on Very Large Data Bases (VLDB 2010), the 11th International Conference on Database Systems for Advanced Applications (DASFAA 2006) and the 3rd International Conference on Mobile Data Management (MDM 2002). He has also served as a member of the Steering Committee of DASFAA (2005-2010). Kian-Lee is a member of ACM and a senior member of the IEEE.
Keynote Speech 4 : Data Science in University: Yet Another Silo or A Hub for Transformation of Education?
Chair : Kyuseok Shim, Seoul National University
Time: Saturday, 14:00 - 15:00
Sang Kyun Cha
Founding Dean of Graduate School of Data Science
Seoul National University
With the advances in computing, big data, and artificial intelligence, data science is emerging as a new essential scientific discipline for almost all academic disciplines and industrial sectors. Many universities around the world have adopted data science in their academic programs in one way or another. The question is how we establish data science within the university: as yet another silo, or as a vehicle for a broader transformation of university education? Funded or stimulated by the Moore and Sloan foundations, a small number of US universities started experimenting with the creation of new academic programs in data science in 2013. Around the same time, Seoul National University started an independent journey of experiments in university-wide data science education and research. This journey resulted in SNU establishing its new Graduate School of Data Science in 2020 with new faculty headcounts, to formalize transformative education and create a hub for leading such education. In this talk, I will present the lessons learned from this journey as well as the vision.
Professor Sang Kyun Cha has been the founding dean of the Graduate School of Data Science of Seoul National University since February 2020. He has led the effort to establish this new graduate school, which aims to help transform Korea’s leading higher education institution in the age of data-driven innovation and AI, since he became founding director of the SNU Big Data Institute in April 2014.
Before taking this transformative role at the university, Prof. Cha was an innovator and entrepreneur who founded ‘Transact In Memory, Inc.’ in Silicon Valley in the early 2000s based on his research on in-memory data management. After SAP AG’s acquisition of the company in late 2005, he led SAP’s secret research toward SAP HANA and served as one of the two co-founding chief architects of SAP HANA until early 2014, when his role at SAP was more or less done and the industry had shifted toward the in-memory paradigm triggered by SAP HANA.
Professor Cha has been promoting the preemptive transformation of education for the new digital era and has been advising the Korean government on education, research priorities, and innovation strategies. Despite his administrative duties, Prof. Cha still teaches two courses: ‘Big Data and Knowledge Management’ and ‘Data-Driven Innovation and Entrepreneurship’. He also runs the university-wide seminar on Data Science, inviting speakers from around the world. He serves as the chair of the IEEE International Conference on Data Engineering Ten-Year Influential Paper Award Committee.
He received his BS and MS from Seoul National University and his Ph.D. from Stanford University.