The post RISC2’s partners gather in Brussels to reflect on three years of collaboration between EU and Latin America first appeared on RISC2 Project.
]]>The session began with a welcome and introduction by Mateo Valero (BSC), one of the main drivers of this cooperation and a leading name in the field of HPC. This intervention was later complemented by Fabrizio Gagliardi (BSC). Afterward, Elsa Carvalho (INESC TEC) presented the work done in terms of communication by the RISC2 team, an important segment for all the news and achievements to reach all the partners and countries involved.
Carlos J. Barrios Hernandez then presented the work done within the HPC Observatory, a relevant source of information that European and Latin American research organizations can turn to with HPC and/or AI issues.
The session closed with an important and pertinent debate on how to strengthen cooperation in HPC between the European Union and Latin America, in which all participants contributed and gave their opinion, committing to efforts so that the work developed within the framework of RISC2 is continued.
What did our partners have to say about the meeting?
Rafael Mayo Garcia, CIEMAT:
“The policy event organized by RISC2 in Brussels was of utmost importance for the development of HPC and digital capabilities for a shared infrastructure between EU and LAC. Even more, it has had crucial contributions to international entities such as CYTED, the Ibero-American Programme for the Development of Science and Technology. On the CIEMAT side, it has been a new step beyond for building and participating in a HPC shared ecosystem.”
Esteban Meneses, CeNAT:
“In Costa Rica, CeNAT plays a critical role in fostering technological change. To achieve that goal, it is fundamental to synchronize our efforts with other key players, particularly government institutions. The event policy in Brussels was a great opportunity to get closer to our science and technology ministry and start a dialogue on the importance of HPC, data science, and artificial intelligence for bringing about the societal changes we aim for.”
Esteban Mocskos, UBA:
“The Policy Event recently held in Brussels and organized by the RISC2 project had several remarkable points. The gathering of experts in HPC research and management in Latin America and Europe served to plan the next steps in the joint endeavor to deepen the collaboration in this field. The advance in management policies, application optimization, and user engagement are fundamental topics treated during the main sessions and also during the point-to-point talks in every corner of the meeting room.
I can say that this meeting will also spawn different paths in these collaboration efforts that we’ll surely see their results during the following years with a positive impact on both sides of this fruitful relationship: Latin America and Europe.”
Sergio Nesmachnow, Universidad de la República:
“The National Supercomputing Center (Uruguay) and Universidad de la República have led the development of HPC strategies and technologies and their application to relevant problems in Uruguay. Specific meetings such as the policy event organized by RISC2 in Brussels are key to present and disseminate the current developments and achievements to relevant political and technological leaders in our country, so that they gain knowledge about the usefulness of HPC technologies and infrastructure to foster the development of national scientific research in capital areas such as sustainability, energy, and social development. It was very important to present the network of collaborators in Latin America and Europe and to show the involvement of institutional and government agencies.
Within the contacts and talks during the organization of the meeting, we introduced the project to national authorities, including the National Director of Science and Technology, Ministry of Education and Culture, and the President of the National Agency for Research and Innovation, as well as the Uruguayan Agency for International Cooperation and academic authorities from all institutions involved in the National Supercomputing Center initiative. We hope the established contacts can result in productive joint efforts to foster the development of HPC and related scientific areas in our country and the region.”
Carla Osthoff, LNCC:
“In Brazil, LNCC is critical in providing High Performance Computing Resources for the Research Community and training Human Resources and fostering new technologies. The policy event organized by RISC2 in Brussels was fundamental to synchronizing LNCC efforts with other government institutions and international entities. On the LNCC side, it has been a new step beyond building and participating in an HPC-shared ecosystem.
Specific meetings such as the policy event organized by RISC2 in Brussels were very important to present the network of collaborators in Latin America and Europe and to show the involvement of institutional and government agencies.
As a result of joint activities in research and development in the areas of information and communication technologies (ICT), artificial intelligence, applied mathematics, and computational modelling, with emphasis on the areas of scientific computing and data science, a Memorandum of Understanding (MoU) has been signed between LNCC and Inria/France. As a result of new joint activities, LNCC and INESC TEC/Portugal are starting a collaboration through the INESC TEC International Visiting Researcher Programme 2023.”
]]>The post Scientific Machine Learning and HPC first appeared on RISC2 Project.
]]>Alvaro Coutinho, coordinator of the High Performance Computing Center (Nacad) at COPPE/UFRJ, presented advances in AI in Engineering and the importance of multidisciplinary research networks to address current issues in Scientific Machine Learning. Alvaro took the opportunity to highlight the need for Brazil to invest in high performance computing capacity.
The country’s sovereignty needs autonomy in producing ML advances, which depends on HPC support at universities and research centers. Brazil has nine machines in the TOP500 list of the most powerful computer systems in the world, but almost all of them belong to the company Petrobras, and universities need much more capacity. ML is well known to require HPC; when ML is combined with scientific computer simulations, HPC becomes essential.
The conventional notion of ML involves training an algorithm to automatically discover patterns, signals, or structures that may be hidden in huge databases and whose exact nature is unknown and therefore cannot be explicitly programmed. This field may face two major drawbacks: the need for a significant volume of (labelled) data that is expensive to acquire, and a limited ability to extrapolate, which makes predictions beyond the scenarios contained in the training data difficult.
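The extrapolation drawback can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the talk: a straight line is fitted by least squares to samples of a hypothetical quadratic process on the interval [0, 1], and the fitted model is then queried far outside that interval. The function names `fit_line`, `true_process`, and `predict` are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def true_process(x):
    # Hypothetical underlying process; the model only ever sees samples of it.
    return x ** 2


# Training data covers only the interval [0, 1].
train_x = [i / 10 for i in range(11)]
train_y = [true_process(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)


def predict(x):
    return slope * x + intercept


# Worst error inside the training range versus the error at x = 3,
# a scenario that the training data does not contain.
in_range_error = max(abs(predict(x) - true_process(x)) for x in train_x)
extrapolation_error = abs(predict(3.0) - true_process(3.0))

print(f"max error inside training range: {in_range_error:.3f}")
print(f"error when extrapolating to x=3: {extrapolation_error:.3f}")
```

The model is acceptable inside the range it was trained on and degrades sharply outside it. Approaches that embed physical knowledge in the model, such as those in the cited references, aim to reduce this kind of failure.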
Considering that an algorithm’s predictive ability is a learning skill, current challenges must be addressed to improve the analytical and predictive capacity of Scientific ML algorithms, for example, to maximize their impact in renewable energy applications. References [1-5] illustrate recent advances in Scientific Machine Learning in different areas of engineering and computer science.
References:
[1] Baker, Nathan, Alexander, Frank, Bremer, Timo, Hagberg, Aric, Kevrekidis, Yannis, Najm, Habib, Parashar, Manish, Patra, Abani, Sethian, James, Wild, Stefan, Willcox, Karen, and Lee, Steven. Workshop Report on Basic Research Needs for Scientific Machine Learning: Core Technologies for Artificial Intelligence. United States: N. p., 2019. Web. doi:10.2172/1478744.
[2] Brunton, Steven L., Bernd R. Noack, and Petros Koumoutsakos. “Machine learning for fluid mechanics.” Annual Review of Fluid Mechanics 52 (2020): 477-508.
[3] Karniadakis, George Em, et al. “Physics-informed machine learning.” Nature Reviews Physics 3.6 (2021): 422-440.
[4] Inria White Book on Artificial Intelligence: Current challenges and Inria’s engagement, 2nd edition, 2021. URL: https://www.inria.fr/en/white-paper-inria-artificial-intelligence
[5] Silva, Romulo, Umair bin Waheed, Alvaro Coutinho, and George Em Karniadakis. “Improving PINN-based Seismic Tomography by Respecting Physical Causality.” In AGU Fall Meeting Abstracts, vol. 2022, pp. S11C-09. 2022.
]]>The post Latin American researchers present greener gateways for Big Data in INRIA Brazil Workshop first appeared on RISC2 Project.
]]>The goal of the investigation is to provide users with simplified access to computing structures through scientific solutions that represent significant developments in their fields. In the case of this project, the aim is to develop intelligent green scientific solutions for BioinfoPortal (a multiuser Brazilian infrastructure) supported by High-Performance Computing environments.
Technologically, it includes areas such as scientific workflows, data mining, machine learning, and deep learning. If successful, the outlook is the analysis and interpretation of Big Data, opening new paths in molecular biology, genetics, biomedicine, and health, which requires tools capable of efficiently digesting the volume of incoming information.
The team performed several large-scale bioinformatics experiments that are considered to be computationally intensive. Currently, artificial intelligence is being used to generate models that analyze computational and bioinformatics metadata in order to understand how machine learning can predict computational resources efficiently. The workshop was held on April 10th and 11th at the University of Sao Paulo.
RISC2 Project, which aims to explore the HPC impact in the economies of Latin America and Europe, relies on the interaction between researchers and policymakers in both regions. It also includes 16 academic partners such as the University of Buenos Aires, National Laboratory for High Performance Computing of Chile, Julich Supercomputing Centre, Barcelona Supercomputing Center (the leader of the consortium), among others.
]]>The post Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence first appeared on RISC2 Project.
]]>By:
Carneiro, B. Fagundes, C. Osthoff, G. Freire, K. Ocaña, L. Cruz, L. Gadelha, M. Coelho, M. Galheigo, and R. Terra are with the National Laboratory of Scientific Computing, Rio de Janeiro, Brazil.
Carvalho is with the Federal Center for Technological Education Celso Suckow da Fonseca, Rio de Janeiro, Brazil.
Douglas Cardoso is with the Polytechnic Institute of Tomar, Portugal.
Boito and L. Teylo are with the University of Bordeaux, CNRS, Bordeaux INP, INRIA, LaBRI, Talence, France.
Navaux is with the Informatics Institute, Federal University of Rio Grande do Sul, Rio Grande do Sul, Brazil.
References:
Ocaña, K. A. C. S.; Galheigo, M.; Osthoff, C.; Gadelha, L. M. R.; Porto, F.; Gomes, A. T. A.; Oliveira, D.; Vasconcelos, A. T. BioinfoPortal: A scientific gateway for integrating bioinformatics applications on the Brazilian national high-performance computing network. Future Generation Computer Systems, v. 107, p. 192-214, 2020.
Mondelli, M. L.; Magalhães, T.; Loss, G.; Wilde, M.; Foster, I.; Mattoso, M. L. Q.; Katz, D. S.; Barbosa, H. J. C.; Vasconcelos, A. T. R.; Ocaña, K. A. C. S; Gadelha, L. BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments. PeerJ, v. 1, p. 1, 2018.
Coelho, M.; Freire, G.; Ocaña, K.; Osthoff, C.; Galheigo, M.; Carneiro, A. R.; Boito, F.; Navaux, P.; Cardoso, D. O. Desenvolvimento de um Framework de Aprendizado de Máquina no Apoio a Gateways Científicos Verdes, Inteligentes e Eficientes: BioinfoPortal como Caso de Estudo Brasileiro In: XXIII Simpósio em Sistemas Computacionais de Alto Desempenho – WSCAD 2022 (https://wscad.ufsc.br/), 2022.
Terra, R.; Ocaña, K.; Osthoff, C.; Cruz, L.; Boito, F.; Navaux, P.; Carvalho, D. Framework para a Construção de Redes Filogenéticas em Ambiente de Computação de Alto Desempenho. In: XXIII Simpósio em Sistemas Computacionais de Alto Desempenho – WSCAD 2022 (https://wscad.ufsc.br/), 2022.
Ocaña, K.; Cruz, L.; Coelho, M.; Terra, R.; Galheigo, M.; Carneiro, A.; Carvalho, D.; Gadelha, L.; Boito, F.; Navaux, P.; Osthoff, C. ParslRNA-Seq: an efficient and scalable RNAseq analysis workflow for studies of differentiated gene expression. In: Latin America High-Performance Computing Conference (CARLA), 2022, Rio Grande do Sul, Brazil. Proceedings of the Latin American High-Performance Computing Conference – CARLA 2022 (http://www.carla22.org/), 2022.
[1] https://bioinfo.lncc.br/
[2] https://git.tecgraf.puc-rio.br/csbase-dev/csgrid/-/tree/CSGRID-2.3-LNCC
[3] https://sdumont.lncc.br
]]>The post Towards a greater HPC capacity in Latin America first appeared on RISC2 Project.
]]>A country that does not have the computational capacity to solve its own problems will have no alternative but to try to acquire solutions provided by others. One of the most important aspects of sovereignty in the 21st century is the ability to produce mathematical models and to have the capacity to solve them. Today, the availability of computing power commensurate with one’s wealth exponentially increases a country’s capacity to produce knowledge. In the developed world, it is estimated that for every dollar invested in supercomputing, the return to society is of the order of US$ 44(1) and to the academic world US$ 30(2). For these reasons, HPC occupies an important place on the political and diplomatic agendas of developed countries.
In Latin America, investment in HPC is very low compared to what the US, Asia and Europe are doing. In order to quantify this difference, we present the tables below, which show the accumulated computing capacity in the ranking of the 500 most powerful supercomputers in the world – the TOP500(3) – (Table 1), and the local reality (Table 2). Other data are also included, such as the population (in millions), the number of researchers per 1,000 inhabitants (Res/1000), the computing capacity per researcher (GFlops/Res) and the computing capacity per US$ million of GDP. In Table 1, we have grouped the countries by geographical area. America appears as the area with the highest computing capacity, essentially due to the USA, which has almost 45% of the world’s computing capacity in the TOP500. It is followed by Asia and then Europe. This TOP500 list includes mainly academic research centres, but also industry ones, typically those used in applied research (many private ones do not wish to publish such information for obvious reasons). For example, in Brazil – which shows good computing capacity with 88,175 TFlops – the vast majority is in the hands of the oil industry and only about 3,000 TFlops are used for basic research. Countries listed in the TOP500 invest in HPC from a few TFlops per million GDP (Belgium 5, Spain 7, Bulgaria 8), through countries investing in the order of hundreds (Italy 176, Japan 151, USA 138), to even thousands, as is the case in Finland with 1,478. For those countries where we were able to find data on the number of researchers, these range from a few GFlops per researcher (Belgium 19, Spain 24, Hungary 52) to close to 1,000 GFlops, i.e. 1 TFlop (USA 970, Italy 966), with Finland surpassing this barrier with 4,647. Note that, unlike what happens locally, countries with a certain degree of development invest every 3-4 years in supercomputing, so the data we are showing will soon be updated and there will be variations in the list.
For example, this year a new supercomputer will come into operation in Spain(4), which, with an investment of some 150 million euros, will give Spain one of the most powerful supercomputers in Europe – and the world.
Country | Rpeak (TFlops) | Population (millions) | Res/1000 | GFlops/Res | TFlops/M US$ |
United States | 3,216,124 | 335 | 9.9 | 969.7 | 138.0 |
Canada | 71,911 | 39 | 8.8 | 209.5 | 40.0 |
Brazil | 88,175 | 216 | 1.1 | 371.1 | 51.9 |
AMERICA | 3,376,211 | 590 | | | |
China | 1,132,071 | 1400 | | | 67.4 |
Japan | 815,667 | 124 | 10.0 | 657.8 | 151.0 |
South Korea | 128,264 | 52 | 16.6 | 148.6 | 71.3 |
Saudi Arabia | 98,982 | 35 | | | 141.4 |
Taiwan | 19,562 | 23 | | | 21.7 |
Singapore | 15,785 | 6 | | | 52.6 |
Thailand | 13,773 | 70 | | | 27.5 |
United Arab Emirates | 12,164 | 10 | | | 15.2 |
India | 12,082 | 1380 | | | 4.0 |
ASIA | 2,248,353 | 3100 | | | |
Finland | 443,391 | 6 | 15.9 | 4647.7 | 1478.0 |
Italy | 370,262 | 59 | 6.5 | 965.5 | 176.3 |
Germany | 331,231 | 85 | 10.1 | 385.8 | 78.9 |
France | 251,166 | 65 | 11.4 | 339.0 | 83.7 |
Russia | 101,737 | 145 | | | 59.8 |
United Kingdom | 92,563 | 68 | 9.6 | 141.8 | 29.9 |
Netherlands | 56,740 | 18 | 10.6 | 297.4 | 56.7 |
Switzerland | 38,600 | 9 | 9.4 | 456.3 | 48.3 |
Sweden | 32,727 | 10 | 15.8 | 207.1 | 54.5 |
Ireland | 26,320 | 5 | 10.6 | 496.6 | 65.8 |
Luxembourg | 18,291 | 0.6 | | | 365.8 |
Poland | 17,099 | 38 | 7.6 | 59.2 | 28.5 |
Norway | 17,031 | 6 | 13.0 | 218.3 | 34.1 |
Czech Republic | 12,914 | 10 | 8.3 | 155.6 | 43.0 |
Spain | 10,296 | 47 | 7.4 | 29.6 | 7.4 |
Slovenia | 10,047 | 2 | 9.9 | 507.4 | 167.5 |
Austria | 6,809 | 9 | 11.6 | 65.2 | 13.6 |
Bulgaria | 5,942 | 6 | | | 8.5 |
Hungary | 4,669 | 10 | 9.0 | 51.9 | 23.3 |
Belgium | 3,094 | 12 | 13.6 | 19.0 | 5.2 |
EUROPE | 1,850,934 | 610.6 | | | |
OTHER | | | | | |
Australia | 60,177 | 26 | | | 40.1 |
Morocco | 5,014 | 39 | | | 50.1 |
Table 1. HPC availability per researcher and relative to GDP in the TOP500 countries (includes HPC in industry).
The local reality is far from this data. Table 2 shows data from Argentina, Brazil, Chile and Mexico. In Chile, the availability of computing power per researcher is 2-3 times less than in the OECD countries with the least computing power and up to 100 times less than for a researcher in the US. In Chile, our investment measured in TFlops per million US$ of GDP is 166 times less than in the US; with respect to the European countries that invest less in HPC it is 9 times less, and with respect to the European average (including Finland) it is 80 times less, i.e. the difference is considerable. It is clear that we need to close this gap. An investment of about 5 million dollars in HPC infrastructure over the next 5 years would narrow this gap by increasing our computational capacity almost 20 times. However, returning to the example of Spain, the supercomputer it will have this year will offer 23 times more computing power than at present and, therefore, we will only maintain our relative distance. If we do not invest, the gap will increase by at least 23 times and will end up being huge. Therefore, we do not only need a one-time investment, but we need to ensure a regular investment. Some neighbouring countries are already investing significantly in supercomputing. This is the case in Argentina, where they are investing 7 million dollars (2 million for the datacenter and 5 million to buy a new supercomputer), which will increase their current capacities by almost 40 times(5).
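The argument about relative distance can be checked with a short calculation. This is a sketch that uses only the growth factors quoted in the paragraph above (almost 20 times for the proposed local investment, 23 times for the new Spanish machine); the function name and the normalisation of today's gap to 1.0 are assumptions made for illustration.

```python
def gap_after_growth(current_gap, own_growth, other_growth):
    """Return the capacity gap after both sides multiply their capacity.

    The gap is the ratio other_capacity / own_capacity, so it scales with
    other_growth / own_growth.
    """
    return current_gap * other_growth / own_growth


# Today's gap is normalised to 1.0 so that results read as
# "times the current gap".
CURRENT_GAP = 1.0
LOCAL_GROWTH_WITH_INVESTMENT = 20.0  # "almost 20 times" from the text
SPAIN_GROWTH = 23.0                  # "23 times more computing power"

gap_with_investment = gap_after_growth(
    CURRENT_GAP, LOCAL_GROWTH_WITH_INVESTMENT, SPAIN_GROWTH
)
gap_without_investment = gap_after_growth(CURRENT_GAP, 1.0, SPAIN_GROWTH)

print(f"gap with investment:    {gap_with_investment:.2f}x today's gap")
print(f"gap without investment: {gap_without_investment:.2f}x today's gap")
```

With the investment the gap stays close to its current size, and without it the gap grows by the full factor of 23. That is why the text asks for regular rather than one-time funding.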
Country | Rpeak (TFlops) | Population (millions) | Res/1000 | GFlops/Res | TFlops/M US$ |
Brazil* | 3,000 | 216 | 1.1 | 12.6 | 1.8 |
Mexico | 2,200 | 130 | 1.2 | 14.1 | 1.8 |
Argentina | 400 | 45 | 1.2 | 7.4 | 0.8 |
Chile | 250 | 20 | 1.3 | 9.6 | 0.8 |
Table 2. HPC availability per researcher and relative to GDP in the region (*only HPC capacity in academia is considered in this table).
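The GFlops/Res column in both tables can be reproduced from the other columns. The following sketch assumes the column is computed as peak capacity divided by the estimated number of researchers (population times researchers per 1,000 inhabitants). That formula is inferred from the published figures, not stated in the article, and the function name is illustrative.

```python
def gflops_per_researcher(rpeak_tflops, population_millions, researchers_per_1000):
    """Peak computing capacity available per researcher, in GFlops."""
    researchers = population_millions * 1_000_000 * researchers_per_1000 / 1000
    rpeak_gflops = rpeak_tflops * 1000  # 1 TFlops = 1,000 GFlops
    return rpeak_gflops / researchers


# Values taken from Table 1 (United States) and Table 2 (Chile).
united_states = gflops_per_researcher(3_216_124, 335, 9.9)
chile = gflops_per_researcher(250, 20, 1.3)

print(f"United States: {united_states:.1f} GFlops per researcher")
print(f"Chile:         {chile:.1f} GFlops per researcher")
print(f"Ratio:         {united_states / chile:.0f}x")
```

The results match the tabulated values for both countries, and their ratio is consistent with the statement that a researcher in Chile has up to 100 times less computing power available than a researcher in the US.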
For the above reasons, we are working to convince the Chilean authorities that we must have greater funding and, more crucially, permanent state funding in HPC. In relation to this, on July 6 we signed a collaboration agreement between 44 institutions with the support of the Ministry of Science to work on the creation of the National Supercomputing Laboratory(6). The agreement recognised that supercomputers are a critical infrastructure for Chile’s development, that it is necessary to centralise the requirements/resources at the national level, obtain permanent funding from the State and create a new institutional framework to provide governance. In an unprecedented inter-institutional collaboration in Chile, the competition for HPC resources at the national level is eliminated and the possibility of direct funding from the State is opened up without generating controversy.
Undoubtedly, supercomputing is a fundamental pillar for the development of any country, where increasing investment provides a strategic advantage, and in Latin America we should not be left behind.
By NLHPC
References
(1) Hyperion Research HPC Investments Bring High Returns
(5) https://www.hpcwire.com/2022/12/15/argentina-announces-new-supercomputer-for-national-science/
]]>The post LNCC encourages the participation of girls and women in science and technology careers first appeared on RISC2 Project.
]]>LNCC, represented by Carla Osthoff, Kary Ocaña and Ana Karl, presented theoretical and practical courses in Bioinformatics, HPC, Mathematical and Chemical Computing.
]]>The post LNCC’s HPC Summer School provided sessions related to HPC to their community first appeared on RISC2 Project.
]]>The School aimed to provide mini-courses and talks related to programming on high-performance computers, such as parallel programming models, profiling tools, and libraries for developing optimized parallel algorithms for the SDumont user community and the high-performance computing programming community.
Due to the extensive territory of Brazil and the number of research projects, it is mandatory to provide regular HPC schools for the research community. According to Carla Osthoff, one of the organizers of this school, “SDumont is the only Brazilian supercomputer dedicated to the research community that is part of the TOP 500 list. The Brazilian Ministry of Science and Technology offers free access to all Brazilian research projects in the country and foreign collaborators. Currently, we have 238 research projects from 18 research areas. This edition of the School received 350 registrations, but we also provided online YouTube access to the community.”
The event was held remotely, and all the sessions are available on YouTube.
]]>The post Call for Proposals to Support High Performance Computing Centers FAPESP-MCTI-MCom-CGI.br first appeared on RISC2 Project.
]]>The Call for High Performance Computing (HPC) Centers aims to support the acquisition of high performance computing equipment that can provide computational infrastructure to conduct research in all areas of knowledge that are intensive in computing resources. The resources necessary for the development of the infrastructure of the facilities to receive the high performance computing equipment are considered to be the responsibility of the proponent institutions and constitute the required counterpart for the presentation of the proposal. In addition, proposers must demonstrate a proven track record as an HPC center.
This program is intended to create infrastructure, not to provide conventional funding for research projects that will eventually take advantage of the infrastructure supported here; support for such research projects should be sought through the regular research funding lines.
A portion of the maintenance costs of the equipment to be purchased may be requested in this Call. However, it is expected that proposals submitted to this Call will also propose other ways to cover equipment maintenance costs. No funds may be requested to cover costs for the maintenance of the building infrastructure and support for computer equipment, such as air conditioning and the like, which should be covered by funds contributed by the proponent institutions or from other sources. Furthermore, the costs of salaries and other charges for the support staff that this Call expects to be available for the operation of the center cannot be requested in proposals submitted to this Call and are the sole responsibility of the proposing institutions. Proponents may provide, in their business plan, for charging for services, provided that some level of free access is offered to users from academic institutions.
WHO?
This Call is open to Education or Research Institutions from all over Brazil, whether in a consortium or not, to support 1 center in the state of São Paulo and 1 or 2 centers in other Brazilian states, in a total amount of up to R$ 100 million. The center based in São Paulo may receive, in this Call, resources of up to R$ 50 million, and must meet the demand for high performance computing services within the entire state of São Paulo. The centers located in other states may receive resources of up to R$ 25 million, in the case of non-consortium projects, or up to R$ 50 million in the case of a consortium of several institutions that meet the demand for high performance computing services nationwide.
This Call is launched within the scope of FAPESP’s Multiuser Equipment Program – EMU and is infrastructural in nature.
Know more about this call here.
]]>The post LNCC participated in the 19th Brazilian Science and Technology National Week first appeared on RISC2 Project.
]]>The talk “The Santos Dumont Supercomputer in the scenario of National and International Scientific Research” was presented on December 3, 2022, and is available here.
]]>The post Managing Data and Machine Learning Models in HPC Applications first appeared on RISC2 Project.
]]>However, to realize the full potential of this synergy, ML models (or models for short) must be built, combined and ensembled, which can be very complex as there can be many models to select from. Furthermore, they should be shared and reused, in particular, in different execution environments such as HPC or Spark clusters.
To address this problem, we proposed Gypscie [Porto 2022, Zorrilla 2022], a new framework that supports the entire ML lifecycle and enables model reuse and import from other frameworks. The approach behind Gypscie is to combine several rich capabilities for model and data management, and model execution, which are typically provided by different tools, in a single framework. Overall, Gypscie provides: a platform for supporting the complete model lifecycle, from model building to deployment, monitoring and policy enforcement; an environment for casual users to find ready-to-use models that best fit a particular prediction problem; an environment to optimize ML task scheduling and execution; an easy way for developers to benchmark their models against other competitive models and improve them; a central point of access to assess models’ compliance with policies and ethics and to obtain and curate observational and predictive data; and provenance information and model explainability. Finally, Gypscie interfaces with multiple execution environments to run ML tasks, e.g., an HPC system such as the Santos Dumont supercomputer at LNCC or a Spark cluster.
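The idea of finding the model that best fits a prediction problem, or of benchmarking one model against competitors, can be sketched generically. The code below is not Gypscie's API, which is not described here. It is a minimal illustration under the assumption that models are plain callables scored by mean squared error on held-out samples; all names and data values are hypothetical.

```python
def mean_squared_error(model, samples):
    """Average squared difference between predictions and observed values."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)


def select_best_model(candidates, validation_samples):
    """Score every candidate on the same data and return the best one.

    candidates maps a model name to a callable taking one input value.
    Returns the winning name and the full score table for benchmarking.
    """
    scores = {
        name: mean_squared_error(model, validation_samples)
        for name, model in candidates.items()
    }
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Hypothetical catalogue of ready-to-use models.
candidates = {
    "constant": lambda x: 2.0,
    "linear": lambda x: 2.0 * x,
    "quadratic": lambda x: x ** 2,
}

# Hypothetical held-out observations as (input, observed value) pairs.
validation_samples = [(0.0, 0.0), (1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

best_name, scores = select_best_model(candidates, validation_samples)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:10s} MSE = {score:.3f}")
print(f"selected model: {best_name}")
```

A framework that manages models centrally can run this kind of comparison across models imported from different tools, because every candidate is evaluated against the same curated data.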
Gypscie comes with SAVIME [Silva 2020], a multidimensional array in-memory database system for importing, storing and querying model (tensor) data. The SAVIME open-source system has been developed to support analytical queries over scientific data. It offers an extremely efficient ingestion procedure, which practically eliminates the waiting time to analyze incoming data. It also supports dense and sparse arrays and non-integer dimension indexing. It offers a functional query language processed by a query optimiser that generates efficient query execution plans.
[Porto 2022] Fabio Porto, Patrick Valduriez: Data and Machine Learning Model Management with Gypscie. CARLA 2022 – Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Sep 2022, Porto Alegre, Brazil. pp.1-2.
[Zorrilla 2022] Rocío Zorrilla, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto: A Data-Driven Model Selection Approach to Spatio-Temporal Prediction. SBBD 2022 – Brazilian Symposium on Databases, SBBD, Sep 2022, Buzios, Brazil. pp.1-12.
[Silva 2020] A.C. Silva, H. Lourenço, D. Ramos, F. Porto, P. Valduriez. Savime: An Array DBMS for Simulation Analysis and Prediction. Journal of Information Data Management 11(3), 2020.
]]>