TASK Quarterly   Scientific Bulletin of the Centre of Informatics - Tricity Academic Supercomputer & networK   ISSN 1428-6394

Volume 21, Number 4, 2017


Contents:

  • Michał Lewczuk, Paweł Cichocki and Józef Woźniak Database and BigData Processing System for Analysis of AIS Messages in the netBaltic Research Project
  • Henryk Krawczyk and Paweł Lubomski Multidisciplinary Open System Transferring Knowledge for R2B Development
  • Anna Wałek and Paweł Lubomski The Bridge to Knowledge – Open Access to Scientific Research Results on Multidisciplinary Open System Transferring Knowledge Platform
  • Henryk Krawczyk and Andrew Targowski Information Society Development Trends, from Data through Knowledge to Wisdom
  • Jamil Abdulhamid Mohammed Saif and Piotr Sumionka Scalability Evaluation of MATLAB Routines for Parallel Image Processing Environment
  • Łukasz Wiszniewski and Dariusz Klimowicz Open extensive IoT research and measurement infrastructure for remote collection and automatic analysis of environmental data
  • Alicja Kwaśniewska, Anna Giczewska and Jacek Rumiński Big Data Significance in Remote Medical Diagnostics Based on Deep Learning Techniques
  • Piotr Orzechowski Complementary oriented allocation algorithm for cloud computing
  • Dorota Grygoruk Open Access to Research Data on Forest Ecosystems in Poland
  • Marcin Krystek, Cezary Mazurek, Raul Palma, Juliusz Pukacki and Jose Manuel Gomez-Perez Research Object as mechanism for ensuring research experiment reproducibility within Virtual Research Environment
  • Jerzy Proficz and Krzysztof Drypczewski Processing of Satellite Data in the Cloud
  • Zofia Kasprzak, Mariusz Polarczyk and Krzysztof Gmerek Quantitative and qualitative development of natural and agricultural science resources in the AGRO database (2009–2016)
  • Dalia Chrzanowska, Rafał Niemiec and Zdzisław Hippe Research on Melanocitic Skin Lesion Infobase Enlargement – New Facts and Concepts


Abstracts:

  • Michał Lewczuk, Paweł Cichocki and Józef Woźniak Database and BigData Processing System for Analysis of AIS Messages in the netBaltic Research Project

    A specialized database and a software tool for graphical and numerical presentation of maritime measurement results have been designed and implemented as part of the research conducted under the netBaltic project (Internet over the Baltic Sea – the implementation of a multi-system, self-organizing broadband communications network over the sea for enhancing navigation safety through the development of e-navigation services). The developed software allows tracing graphs of radio connections between shore stations and vessels (offshore units), based on historical data including the traffic of ships and their specific parameters collected on the Baltic Sea during the last four years. It also enables preparation of data for network simulation experiments using AIS (Automatic Identification System) and GPS (Global Positioning System) loggers installed on shore stations and vessels, taking into account a number of input parameters, such as: time range, coast station selection, ship flags based on MMSI numbers, and types and ranges of possible communication technologies used (WiFi, WiMax, Radwin, LTE, etc.). The created tool has a multi-layer architecture that utilizes the MariaDB SQL database, the Apache2 web server, and a number of PHP applications. The runtime environment has been built on Debian Linux version 8 and an HP C7000 cluster with 16 CPUs of the x86-64 architecture. The modularity of the application allows parallel processing and, therefore, efficient use of the computing cluster. The database contains more than 70 million records, which enables simulation of various topologies (with multi-hop transmissions) and network operations depending on the transmission techniques being used. The database is fully scalable and allows further data collected during subsequent measurement sessions to be added easily. Additionally, the use of virtualization tools facilitates future migration to more efficient processing environments in case of a significant increase in the volume of data. The data recorded in the database allows calculation of statistics for the surveyed networks and determination of the incidence of potential network nodes (e.g. by flag), complete with their available communication techniques. This information is important in determining the structures of possible multi-hop networks and their performance. The software finds routes for datagrams according to accepted criteria and exports the results to a network traffic simulator. As such, it is an important part of the framework used for planning subsequent measurement campaigns and determining which communications equipment would be more suitable for vessels.
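
    The multi-hop route search mentioned in this abstract can be illustrated with a minimal sketch. It is not the project's PHP implementation: the function names, node names, coordinates and the 25 km range below are hypothetical, and "fewest hops" is only one possible routing criterion. The sketch links any two nodes whose great-circle distance is within the assumed radio range and then runs a breadth-first search.

```python
import math
from collections import deque

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def build_links(positions, range_km):
    """Link every pair of nodes that lies within the assumed radio range."""
    links = {name: set() for name in positions}
    names = list(positions)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if haversine_km(positions[u], positions[v]) <= range_km:
                links[u].add(v)
                links[v].add(u)
    return links


def min_hop_route(links, source, target):
    """Breadth-first search: a route with the fewest hops, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbour in sorted(links[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical snapshot: one shore station and three vessels (lat, lon).
positions = {
    "shore": (55.0, 18.0),
    "ship_a": (55.0, 18.3),
    "ship_b": (55.0, 18.6),
    "ship_c": (55.0, 18.9),
}
links = build_links(positions, range_km=25.0)
print(min_hop_route(links, "shore", "ship_c"))
```

    In this layout only neighbouring nodes are within range of each other, so a datagram from the shore station has to be relayed by the two intermediate vessels. A different criterion (for example, preferring a particular link technology) would replace the breadth-first search with a weighted shortest-path search.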

  • Henryk Krawczyk and Paweł Lubomski Multidisciplinary Open System Transferring Knowledge for R2B Development

    Despite many efforts, transferring knowledge from research to business (R2B) remains a serious problem. The problem is especially visible in Poland, where R2B cooperation is ineffective. We try to address this problem with IT support. The manuscript presents solutions developed at the Gdańsk University of Technology. In particular, the platform called “MOST Knowledge” is described in detail. Its layered architecture and some of the new services it offers are shown. A new interdisciplinary approach is proposed to support communication and cooperation between these different worlds. Finally, a short comparison with other available platforms is included and discussed.

  • Anna Wałek and Paweł Lubomski The Bridge to Knowledge – Open Access to Scientific Research Results on Multidisciplinary Open System Transferring Knowledge Platform

    The European policy of Open Access to scientific research is now one of the key issues discussed in public debates on the future development of scientific communication. The implementation of Open Access tools has a significant impact on scientific and economic growth. On the one hand, Open Access accelerates the dissemination of new research findings and facilitates recognition of authors on a more global scale. On the other hand, Open Access helps provide equal access to knowledge and stimulates innovation. Thus, it has an important role in creating the modern information society and economic growth. International organisations, the European Union and governments of individual countries support the idea of Open Access by issuing recommendations and guidelines on making the outputs of research financed from public funds freely available. The paper aims to discuss the process of preparing and implementing the Open Access policy at the institutional level as well as the functionality and tasks of the Open Repository which is now being established on the Multidisciplinary Open System Transferring Knowledge Platform. The acronym of its name in the Polish language is “MOST Wiedzy”, which means “Bridge of Knowledge”. The repository is designed as an archive of scientific publications, research data, scientific dissertations, as well as other documents and sources, created as a result of scientific experiments and other research and development work conducted at the Gdańsk University of Technology. It will also be a solution supporting communication between researchers and a platform for cooperation between science and business.

  • Henryk Krawczyk and Andrew Targowski Information Society Development Trends, from Data through Knowledge to Wisdom

    The paper investigates both the causes and effects of the rapid increase in the data volume (Big Data) and their impact on human cognition. The role of the Internet in distributing and exchanging such data, and their impact on the growth of the Information Society, are emphasized. As a result, Wisdom Science – a new kind of research – emerges, which has the potential to facilitate more advanced solutions in the digital world. In consequence, new kinds of info-driven devices, services and systems called “smart” are developed and applied in almost every aspect of human activities around the world. However, it is not enough for humans to use all those well-informed smart devices and systems because, first of all, their decisions should be wise. Therefore, the paper, coming from a cognitive informatics approach, defines wisdom and its applications, illustrated by some practical cases. Based on this, relations between knowledge and wisdom are shown, and human abilities corresponding to them are defined. These abilities can determine the transformation of a knowledge society into a wisdom society.

  • Jamil Abdulhamid Mohammed Saif and Piotr Sumionka Scalability Evaluation of MATLAB Routines for Parallel Image Processing Environment

    Image edge detection plays a crucial role in image analysis and computer vision; it is defined as the process of finding the boundaries between objects within the considered image. The recognized edges may further be used in object recognition or image matching. In this paper a Canny image edge detector is used, which gives acceptable results that can be utilized in many disciplines. However, this technique is time-consuming, especially when a big collection of images is analyzed. For that reason, to enhance the performance of the algorithms, a parallel platform that speeds up the computation is used. The scalability of a multicore supercomputer node, which is exploited to run the same routines for a collection of color images (from 2100 to 42 000 images), is investigated.

  • Łukasz Wiszniewski and Dariusz Klimowicz Open extensive IoT research and measurement infrastructure for remote collection and automatic analysis of environmental data

    Internet of Things devices that send small amounts of data do not need high bit rates, as it is the range that is more crucial for them. The use of the popular, unlicensed 2.4 GHz and 5 GHz bands is strictly regulated by law (transmission power cannot be increased above the power limits). In addition, waves of this length propagate very poorly under field conditions (e.g. in urban areas). The market response to these needs is the LPWAN (Low Power WAN) type of network, whose main features are far-reaching wireless coverage and low-power measurement end-nodes that can be battery powered for months. One of the promising LPWAN technologies is LoRaWAN, which uses a publicly available 868 MHz band (in Europe) and has a range of up to 20 km. This article presents how the LoRaWAN network works and describes the installation of the research and measurement infrastructure in this technology, which was built in the Gdańsk area using the Academic Computer Center TASK network infrastructure. The methodology and results of the qualitative and performance studies of the constructed network, carried out with unmanned aircraft equipped with measuring devices for remote collection of environmental data, are also presented. The LoRaWAN TASK has been designed to support the development of other research projects as an access infrastructure for a variety of devices. Registered users can attach their own devices that send specific metrics, which are then collected in a cloud-based database, analyzed and visualized.

  • Alicja Kwaśniewska, Anna Giczewska and Jacek Rumiński Big Data Significance in Remote Medical Diagnostics Based on Deep Learning Techniques

    In this paper we discuss the evaluation of neural networks with regard to medical image classification and analysis. We also summarize the existing image databases which could be used for training deep models that can later be utilized in remote home-based health care systems. In particular, we propose methods for remote video-based estimation of patient vital signs and other health-related parameters. Additionally, potential challenges of using, storing and transferring sensitive patient data are discussed.

  • Piotr Orzechowski Complementary oriented allocation algorithm for cloud computing

    Nowadays cloud computing is one of the most popular processing models. More and more different kinds of workloads have been migrated to clouds. This trend obliges the community to design algorithms which could optimize the usage of cloud resources and be more efficient and effective. The paper proposes and analyzes a new model of workload allocation which is based on the complementarity relation. An example use case is shown and the resulting improvement in workload execution is presented.

  • Dorota Grygoruk Open Access to Research Data on Forest Ecosystems in Poland

    Studies of forest ecosystems enable gathering important information on the natural environment, whose development is increasingly disturbed by global climate change. The current research on ecosystem functioning provides data that may be of much value for future analysis and prognostic studies. Modern measurement techniques used in forest research have a significant influence on the growth of database resources, especially those concerning spatial data. Big data requires the use of advanced analytical technologies, such as data warehouses, computer clusters or cloud computing. Consequently, cooperation of specialists from various scientific disciplines, including forestry, geography, climatology and computer science, has become increasingly necessary. The IT system of the Forest Research Institute (FRI) was modernized within the framework of the Operational Programme – Innovative Economy 2007–2013. Its functionality allows integrating, storing and analyzing ever larger databases from dispersed sources. The idea of open access to data is realized by the FRI mainly through publication of research results in domestic and foreign scientific journals, in specialized information services and on scientific portals. On the other hand, open access to raw data still raises a lot of concern and controversy in the scientific community, especially in the context of copyright infringement.

  • Marcin Krystek, Cezary Mazurek, Raul Palma, Juliusz Pukacki and Jose Manuel Gomez-Perez Research Object as mechanism for ensuring research experiment reproducibility within Virtual Research Environment

    A Research Object (RO) is defined as a semantically rich aggregation of resources that bundles together essential information relating to experiments and investigations. This information is not limited merely to the data used and the methods employed to produce and analyze such data, but it may also include the people involved in the investigation as well as other important metadata that describe the characteristics, inter-dependencies, context and dynamics of the aggregated resources. As such, a research object can encapsulate scientific knowledge and provide a mechanism for sharing and discovering assets of reusable research and scientific knowledge within and across relevant communities, and in a way that supports reliability and reproducibility of investigation results. While there are no pre-defined constraints related to the type of resources a research object can contain, the following usually apply in the context of scientific research: data used and results produced; methods employed to produce and analyze data; scientific workflows implementing such methods; provenance and settings; people involved in the investigation; annotations about these resources, which are essential to the understanding and interpretation of the scientific outcomes captured by a research object. The example research object contains a workflow, input data and results, along with a paper that presents the results and links to the investigators responsible. Annotations on each of the resources (and on the research object itself) provide additional information and characterize, e.g., the provenance of the results. Therefore, exploitation of the RO model should be considered as a way to provide additional reliability and reproducibility of the research. The concept of the RO was introduced to the environment created in the EVER-EST project in the form of a Virtual Research Environment (VRE). A group of Earth Scientists, who observe, analyze and model processes that take place on land and sea, was surveyed about their needs and expectations regarding possible improvements in their scientific work. The results show that the scientists' expectations are focused on knowledge sharing and reuse, and on new forms of scholarly communication beyond PDF articles as tools supporting knowledge cross-fertilization between their members. The Research Object concept seems a natural answer to these needs. However, the model, in order to be sufficient and usable, must become a part of the working environment and needs to be integrated with the actual tools. Therefore, great efforts have been undertaken to create a generic technical solution – the VRE, which implements the expected functionalities. In this article we present a concept of the VRE as a tool that takes advantage of the Research Object model in order to integrate and simplify information exchange, as well as to persist, share and discover assets of reusable research. Moreover, we present example scenarios of VRE usage in four different Earth Science domains.

  • Jerzy Proficz and Krzysztof Drypczewski Processing of Satellite Data in the Cloud

    The dynamic development of digital technologies, especially those dedicated to devices generating large data streams, such as all kinds of measurement equipment (temperature and humidity sensors, cameras, radio-telescopes and satellites – Internet of Things), enables more in-depth analysis of the surrounding reality, including better understanding of various natural phenomena, starting from atomic-level reactions, through macroscopic processes (e.g. meteorology), to observation of the Earth and outer space. On the other hand, such a large quantitative improvement requires a great number of processing and storage resources, resulting in the recent rapid development of Big Data technologies. Since 2015, the European Space Agency (ESA) has been providing a great amount of data gathered by exploratory equipment: a collection of Sentinel satellites, which perform Earth observation using various measurement techniques. For example, Sentinel-2 provides a stream of digital photos, including images of the Baltic Sea and the whole territory of Poland. This data is used in an experimental installation of a Big Data processing system based on open source software at the Academic Computer Center in Gdańsk. The center has one of the most powerful supercomputers in Poland – the Tryton computing cluster, consisting of 1600 nodes interconnected by a fast InfiniBand network (56 Gbps) and over 6 PB of storage. Some of these nodes are used as a computational cloud supervised by an OpenStack platform, where the Sentinel-2 data is processed. A subsystem for automatic, perpetual data download to object storage (based on Swift) is deployed, the software libraries required for image processing are configured and an Apache Spark cluster has been set up. The above system enables gathering and analysis of the recorded satellite images and the associated metadata, benefiting from parallel computation mechanisms. This paper describes the above solution, including its technical aspects.

  • Zofia Kasprzak, Mariusz Polarczyk and Krzysztof Gmerek Quantitative and qualitative development of natural and agricultural science resources in the AGRO database (2009–2016)

    The types of natural and agricultural science resources contained in the AGRO database are characterized and their dynamic development in qualitative and quantitative terms in 2009–2016 is described. In addition, the types of database records are presented, with justification for distinguishing between them: records containing only a bibliographic description of the article; bibliographic records with authors' affiliations; records that additionally include summaries and attached bibliographies; and records of the highest information value, most frequently searched by database users, namely those containing full texts of articles. Furthermore, the database recipients and their information and search preferences, based on surveys, are defined. The use of AGRO in Poland and abroad is considered based on selected statistical data. The AGRO database development plans are discussed; they depend on the acquisition of funds for its maintenance, its quantitative development and an increase in the number of records with full texts.

  • Dalia Chrzanowska, Rafał Niemiec and Zdzisław Hippe Research on Melanocitic Skin Lesion Infobase Enlargement – New Facts and Concepts

    The melanocytic skin lesion infobase, available at http://synthesis.melanoma.pl (also http://synteza.melanoma.pl, in Polish; referred to as INP), is currently undergoing a complete modification of the way in which (i) the internal synthesis algorithms and (ii) the classification of lesions are performed. We investigated 29 new real images of melanocytic skin lesions, focusing on how humans perform classification based on experience. In conclusion, we suggest adding a new color – connected with the depth of a lesion – to the K term of the Asymmetry (A), Border (B) and linear Combination of colors and structures (K) method (referred to as ABK).