Latest Issue

2023, Volume 14, Issue 5

Research Paper
The evolution of a geoscience standard: An instructive tale of science keyword development and adoption
Mark A. Parsons, Ruth Duerr, Øystein Godøy
2023, 14(5): 101400. doi: 10.1016/j.gsf.2022.101400
In 1987, NASA sponsored an international workshop that inspired the Directory Interchange Format or DIF – a metadata format to enable "catalog interoperability". The DIF formed the basis of the International Directory Network (IDN) and the Global Change Master Directory (GCMD) and included a set of science keywords. The primary intent was to catalog NASA Earth science and related data, but the keywords have been implemented in many different systems and adopted in varying ways by many different organizations around the world. This review provides an ethnographic examination of how the keywords have evolved and been managed and how they have been adopted over the last 35 years. It illustrates how semantic approaches have evolved over time and provides insights on how standards and associated processes can be sustained and made adaptable. Ongoing institutional commitment is essential, but so are transparency and technical flexibility. Understanding and empowering the different roles involved in the creation, maintenance, and use of standards, as well as the services that standards enable, is also critical. It is apparent that semantic representations need to be mindful of different contexts and carefully define verbs as well as nouns and categories. Understanding and representing relationships is central to interdisciplinary interoperability.
Fact-condition statements and super relation extraction for geothermic knowledge graphs construction
Qizhi Chen, Hong Yao, Shengwen Li, Xinchuan Li, Xiaojun Kang, Wenwen Lai, Jian Kuang
2023, 14(5): 101412. doi: 10.1016/j.gsf.2022.101412
Researchers utilize information from the geoscience literature to deduce regional or global geological evolution. Traditionally this process has relied on the labor of researchers. As the number of papers continues to increase, acquiring domain-specific knowledge becomes a heavy burden. The knowledge graph (KG) has been proposed as a new knowledge representation technology to change this situation. However, previous KGs do not consider the super relation, which bridges a geological phenomenon (fact) and its precondition (condition). For instance, in the statement (“the late Archean was a crucial transition period in the history of global geodynamics”), the condition statement (“crucial transition for global geodynamics”) complements the fact statement (“the late Archean was a crucial transition period”) and accurately defines the scale of the crucial transition in the late Archean. In this study, fact-condition statement extraction is introduced to construct a geological knowledge graph. A rule-based multi-input multi-output model (R-MIMO) is proposed for information extraction. In the R-MIMO, fact-condition statements and their super relation are considered and extracted for the first time. To verify its performance, a GeothCF dataset with 1455 fact tuples and 789 condition tuples is constructed. In experiments, the R-MIMO model achieves its best performance with BERT as the encoder and LSTM-d as the decoder, reaching an F1 of 80.24% in tuple extraction and an F1 of 70.03% in the tag prediction task. Furthermore, the geothermic KG with super relation is automatically constructed for the first time by the trained R-MIMO, which can provide structured data for further geothermic research.
Construction and application of an ontology-based domain-specific knowledge graph for petroleum exploration and development
Xianming Tang, Zhiqiang Feng, Yitian Xiao, Ming Wang, Tianrui Ye, Yujie Zhou, Jin Meng, Baosen Zhang, Dongwei Zhang
2023, 14(5): 101426. doi: 10.1016/j.gsf.2022.101426
The massive volume of multi-sourced, multi-structured data in the upstream petroleum industry poses great challenges for data integration and smart applications. The knowledge graph, as an emerging technology, can potentially provide a way to tackle the challenges associated with oil and gas big data. This paper proposes an engineering-based method that can improve upon traditional natural language processing to construct a domain knowledge graph based on a petroleum exploration and development ontology. The exploration and development knowledge graph is constructed by assembling Sinopec's multi-sourced heterogeneous databases and comprises millions of nodes. Two applications based on the constructed knowledge graph are developed and validated for their effectiveness and advantages in providing better knowledge services for the oil and gas industry.
Climate paleogeography knowledge graph and deep time paleoclimate classifications
Chenmin Yu, Laiming Zhang, Mingcai Hou, Jianghai Yang, Hanting Zhong, Chengshan Wang
2023, 14(5): 101450. doi: 10.1016/j.gsf.2022.101450
Climate paleogeography, especially climate classification, helps to interpret global and regional climate changes and to compare climate conditions in different regions intuitively. However, the application of climate classification in deep time (i.e., climate paleogeography) is hindered by the usually qualitatively constrained paleoclimate and by the inconsistent descriptions and semantic heterogeneity of the climate types. In this study, a climate paleogeography knowledge graph is established under the framework of the Deep-Time Digital Earth program (DDE). The hierarchical knowledge graph consists of five paleoclimate classifications based on various strategies. The classifications are described and their strengths and weaknesses are evaluated in four aspects: simplicity, applicability, quantifiability, and comparability. We also reconstruct the global climate distributions in the Late Cretaceous according to these classifications. The results are compared and the relationships among the climate types in different classifications are evaluated. Our study unifies scientific concepts from different paleoclimate classifications, which provides an important theoretical basis for the application of paleoclimate classifications in deep time.
A knowledge graph and service for regional geologic time standards
Chao Ma, Amruta Suresh Kale, Jiyin Zhang, Xiaogang Ma
2023, 14(5): 101453. doi: 10.1016/j.gsf.2022.101453
Geologic time is an important dimension in geological research. Geologic time data are commonly collected from multiple sources in data-intensive studies of Earth's history, which raises issues of data cleansing and integration. A knowledge graph of the international geological time scale has been established to harmonize heterogeneous data and to facilitate effective and efficient data-driven discovery. Although many regional geologic time standards are also used in various databases and literature, there has been limited discussion or development of knowledge graphs for them. In this research, we construct a knowledge graph for the geologic time standards in 17 regions at the Epoch and Age levels. This regional geologic time knowledge graph is integrated with the international geologic time knowledge graph as a comprehensive deep-time knowledge base. A SPARQL endpoint has been established to provide an open and free online service for the knowledge base. Several use cases are presented here to demonstrate the functionality of the knowledge graph we built as well as its application in open data exploration. Our work addresses the shortage of machine-readable knowledge graphs for regional geologic time standards and will help accelerate geologic data integration from multiple sources in data-intensive studies. All data and code in this paper are made open source and are accessible on GitHub and Zenodo.
A comprehensive construction of the domain ontology for stratigraphy
Huiqing Xu, Yingying Zhao, Hao Huang, Shaochun Dong, Yukun Shi, Chunju Huang, Huaichun Wu, Zhiqi Qian, Qiang Fang, Huaguo Wen, Zhongtang Su, Shuang Dai, Ronghua Wang, Chao Li, Chao Sun, Junxuan Fan
2023, 14(5): 101461. doi: 10.1016/j.gsf.2022.101461
Stratigraphic knowledge, the cornerstone of geoscience, needs to be represented by an ontology-based knowledge graph in order to apply state-of-the-art big-data techniques. This study aims to comprehensively construct the ontologies for the stratigraphic domain. This has been achieved by a federated, crowd intelligence-based collaboration among domain experts of major stratigraphic subdisciplines. The initial step is to enumerate key terms from authoritative references and incorporate them into the Geoscience Professional Knowledge Graphs (GPKGs) of the Deep-time Digital Earth Project. During this process, semantic heterogeneities were meticulously addressed by professional judgement aided by automatic detection of homonyms on the GPKGs platform. Afterwards, these terms were further differentiated as either classes or properties and arranged in a hierarchical framework in a top-down process. Consequently, seven ontologies are constructed for major stratigraphic branches, i.e., Lithostratigraphy, Biostratigraphy, Chronostratigraphy, Chemostratigraphy, Magnetostratigraphy, Cyclostratigraphy and Sequence Stratigraphy. The ontology of Biostratigraphy, among them, is elaborated here, as no biostratigraphic ontology has been attempted before to our knowledge. The constructed biostratigraphic ontology comprises the following major root classes: Fossil, Biostratigraphic unit, and Biostratigraphic horizon. Altogether, they contribute to the eventual dating and correlating of strata in another root class: Biostratigraphic correlation. In summary, the ontologies constructed in this study are probably the most comprehensive to date for the stratigraphic domain. Moreover, a prototype semantic search engine was conceived to discuss the potential application of our work for better querying of stratigraphic references, utilizing the semantic links among the classes in the constructed ontologies.
A unified framework of temporal information expression in geosciences knowledge system
Shu Wang, Yunqiang Zhu, Yanmin Qi, Zhiwei Hou, Kai Sun, Weirong Li, Lei Hu, Jie Yang, Hairong Lv
2023, 14(5): 101465. doi: 10.1016/j.gsf.2022.101465
Time is an essential reference system for recording objects, events, and processes in the field of geosciences. There are currently various time references, such as the solar calendar, geological time, and regional calendars, used to represent knowledge in different domains and regions, so a time conversion process is required to interpret temporal information under different time references. However, the current time conversion method is limited by the application scope of existing time ontologies (e.g., “Jurassic” is a period in geological ontology, but a point value in calendar ontology) and by its reliance on experience in conversion processes. These issues restrict accurate and efficient calculation of temporal information across different time references. To address these issues, this paper proposes a Unified Time Framework (UTF) in the geosciences knowledge system. Based on a systematic parsing of time elements from a large number of time references, the proposed UTF designs an independent time root node to exclude irrelevant nodes when accessing different time types and to adapt to the time expressions of different geoscience disciplines. Furthermore, the UTF includes several designs: quantitative relationship definitions to ensure the accuracy of time expressions; unified time nodes and structures to enable time calculations across different time elements; and adequate interfaces to link to the required external ontologies. In an experimental comparison with the time conversion method, the UTF supports accurate and efficient calculation of temporal information across different time references in SPARQL queries. Moreover, it shows higher and more stable performance in temporal information queries than the time conversion method. With the advent of the Big Data era in the geosciences, the UTF can be used more widely to discover new geosciences knowledge across different time references.
A knowledge graph for standard carbonate microfacies and its application in the automatical reconstruction of the relative sea-level curve
Han Wang, Hanting Zhong, Anqing Chen, Keran Li, Hang He, Zhe Qi, Dongyu Zheng, Hongyi Zhao, Mingcai Hou
2023, 14(5): 101535. doi: 10.1016/j.gsf.2023.101535
The reconstruction of high-resolution sea-level variation curves in deep time based on the standard carbonate microfacies knowledge graph (SMFKG) is of great scientific significance for exploring the Earth system evolution and predicting future sea-level and climate changes. In this study, the concepts, attributes, and relationships among standard carbonate microfacies (SMF) are comprehensively analyzed; an ontology layer is established and its data layer is constructed using thin-section descriptions; and finally, the SMFKG is established. Additionally, based on the knowledge graph, an application for automatically identifying SMF using identification markers and reconstructing the high-resolution relative sea-level variation curve using the SMF and facies zones is compiled. Then, all thin sections of the late Ediacaran Dengying Formation in the western margin of the Yangtze Platform are observed and described in detail, the SMF and facies zones are identified automatically, and the relative sea-level curve is reconstructed automatically using the SMFKG. The reconstruction results show that the Yangtze Platform experienced four sea-level rise and fall cycles in the late Ediacaran, of which two intense regressions led to subaerial-exposed unconformities in the interior and top of the Dengying Formation, which is highly consistent with previous research results. This shows that the high-resolution relative sea-level variation curve in deep time can be reconstructed efficiently and intelligently using the SMFKG. Additionally, in the near future, the combination of an automatic digital slide-scanning system, machine-learning techniques, and the SMFKG can achieve one-stop fully automatic SMF recognition and reconstruction of high-resolution relative sea-level variation curves in deep time, which has a high application value.
Quantifying the influence of magmatism and tectonism on ultraslow-spreading-ridge hydrothermal activity: Evidence from the Southwest Indian Ridge
Xing Xu, Shili Liao, Chunhui Tao, Lushi Liu
2023, 14(5): 101584. doi: 10.1016/j.gsf.2023.101584
Hydrothermal activity in mid-ocean ridges (MORs) is an important intermediary for the mass and heat exchange between the ocean and lithosphere. The development of hydrothermal activity on MORs is primarily controlled by coupled magmatic and tectonic activities. In ultraslow-spreading ridges, deep-dipping low-angle normal faults with large offsets, typically detachment faults in the inside corners of ridge offsets, favor the formation of tectonic-related hydrothermal activities, whereas volcanic-related hydrothermal fields are typically developed in neovolcanic zones in this category of the ridge system. However, whether tectonic or magmatic activity is dominant and to what extent they control the formation of hydrothermal activities on ultraslow-spreading ridges remain unclear. Segments in the west and east of the Gallieni transform fault (TF) located in the ultraslow-spreading Southwest Indian Ridge (SWIR), namely, western area (WA) and eastern area (EA), exhibit distinct magma-supply conditions that provide favorable conditions for examining the influence of magmatic and tectonic activities. We generated prediction models for these areas using the spatial analysis of the water depth, minor faults, large faults, ridge axis, nontransform discontinuity (NTD) inside corners, TF inside corners, Bouguer gravity anomaly, magnetic anomalies, and seismic activities. By employing the weights of evidence method, we reported that the formation of seafloor hydrothermal systems in SWIR was primarily correlated to the NTD inside corner, ridge axis, and minor fault (i.e., contrast values (C) of 4.186, 3.727, and 3.482 in WA and 4.278, 3.769, and 3.135 in EA). Furthermore, EA was significantly affected by the TF inside corner (C = 3.501), whereas WA was influenced by large faults (C = 4.062). 
Our results demonstrated that tectonism was the primary controlling factor in the development of hydrothermal activities in the study area, and the contribution of magmatism was secondary, even in WA, which has a relatively robust magma supply. We delimited prominent prospecting areas at each side based on posterior probability. Our results provided insights into the formation mechanisms of hydrothermal activities and support prospecting in MORs.
Cyclicity related to solar activity in lacustrine organic-rich shales and their significance to shale-oil reservoir formation
Miruo Lin, Kelai Xi, Yingchang Cao, Rukai Zhu, Xiaobing Niu, Honggang Xin, Weijiao Ma
2023, 14(5): 101586. doi: 10.1016/j.gsf.2023.101586
The formation mechanism of micron- to centimeter-scale sedimentary cycles in lacustrine shales is a hot topic of research, because these small-scale sedimentary cycles significantly influence shale-oil distribution heterogeneity. High-frequency paleoenvironmental evolution is an important controlling factor for the formation of small-scale sedimentary cycles. However, the driving factors of high-frequency paleoenvironmental evolution and the formation process of sedimentary cycles under its constraint remain speculative. In this study, which focuses on lacustrine shales, we find that the alternating deposition of organic-rich lamina (ORL) and silty-grained felsic lamina (SSFL) of variable thickness forms sedimentary cycles on the micron to centimeter scale in the Chang 7₃ sub-member of the Yanchang Formation in the Ordos Basin. Based on detailed petrographic characterization, in-situ geochemical parameter testing, and high-resolution cycle analysis, the formation process of cyclical sedimentary records and related paleoenvironmental evolution are investigated. Three solar activity cycles were identified from the shales, namely the 360–500 yr, 81–110 yr, and 30–57 yr cycles (cycles Ⅰ, Ⅱ, and Ⅲ, respectively). High-frequency paleoenvironmental evolution caused by solar activity induced lake-level fluctuation, which further controlled silty-grained sediment deposition and organic matter preservation in deep lake areas. Cycle Ⅰ controlled relatively long-term lake-level fluctuation, driving the deposition of several pairs of SSFL and ORL at the centimeter scale. Cycles Ⅱ and Ⅲ were short-term cycles and acted on the millimeter to micrometer scale, further complicating the sedimentary strata that formed during the period of lake-level fall induced by cycle Ⅰ. The cyclic deposition of SSFL and ORL corresponds to cycle Ⅲ. Lake-level fluctuation influenced by cycle Ⅱ mainly caused SSFL thickness variation in each lamina couplet.
During the period of lake-level rise induced by cycle Ⅰ, periods of lake-level rise during cycles Ⅱ and Ⅲ show cyclic variation in reducibility and are not thought to control the supply of coarse-grained sediments into the deep lake areas. Frequent lake-level fluctuation promotes lamina couplet formation in thickly-bedded shales, which creates favorable conditions for shale-oil accumulation. Oil produced from ORL can migrate locally into dissolved feldspar porosity in SSFL and is therefore able to accumulate in shales, which creates high potential for future oil exploration in thickly-bedded lacustrine shales.
Role of metasomatized mantle lithosphere in the formation of giant lode gold deposits: Insights from sulfur isotope and geochemistry of sulfides
Baisong Du, Zuoman Wang, M. Santosh, Yuke Shen, Shufei Liu, Jiajun Liu, Kexin Xu, Jun Deng
2023, 14(5): 101587. doi: 10.1016/j.gsf.2023.101587
The Wulong deposit is one of the largest quartz vein-type gold deposits with at least 80 tons of identified gold reserves in the eastern part of the Liaodong Peninsula. Gold orebodies are mainly hosted in the Late Jurassic gneissic two-mica granite and Early Cretaceous diorite dykes, and are structurally controlled by the NNE- and NW-trending faults. Gold mineralization mainly occurs as veins with lenticular shapes and is closely associated with sulfides and Bi minerals. Previous studies on the deposit mainly focused on its geological characteristics, fluid inclusions and the timing of g