Old snapshot of IDI's research cluster "ClustIs"
Our stand at NOTUR 2003, Oslo
This web page documents the accomplishments within the subproject Cluster Technologies.
Cluster technologies are here defined as the technologies that enable a group of independent computers (e.g. PCs, workstations, SMPs) to work together as a single distributed-memory system. Since traditional special-purpose hardware for compute servers is generally considered much more expensive than building and scaling up cluster systems based on general-purpose parts (with comparable amounts of memory and CPU power), cluster systems are attractive as potential compute servers for future high-performance computing (HPC) applications.
Some general issues considered:
These included the evaluation of new algorithms and methods with respect to future resources, as well as numerical testing of generic operations. We also looked at cluster-related tools, including furthering our current work on execution monitoring for clusters. Security, stability and operational cost issues were discussed.
Based on our findings, we sought cooperation with relevant cluster activities in Norway and elsewhere where appropriate regarding, for instance, the exchange of computer resources to obtain more diverse test beds (e.g. NTNU exchanged cycles with Linköping). Ties to the Grid Technology program were also established.
The project leader reported to the project leader of NOTUR.
Budget: NOK 100.000 for summer student support.
Paul Sack, a former undergraduate student of Elster's at The University of Texas at Austin, ported a physics code to a cluster during the summer of 2002 as part of the precursor to this project. (His work was financed through the Computing Center (ITEA) at NTNU.) His contribution led to a talk and a report highlighting some of the difficulties associated with porting such codes to cluster systems:
This activity was an extension of this work.
The applications were selected based on interest from the application community and on our access to the code authors.
Elster and her student Åsmund Østvold looked at porting Protomol, a molecular dynamics code running on our HPC systems that was parallelized by colleagues in Bergen. Several challenges were uncovered, including the difficulty of porting a code that uses one-sided communication routines such as MPI put and get, which rely on DMA (direct memory access) features not (yet) present on clusters. Timing issues were also uncovered. Details of our findings can be found in Østvold's report "Porting of Protomol from SMP to a Computational Cluster" (in Norwegian only).
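To illustrate why one-sided routines are hard to support on commodity clusters, the following sketch (purely illustrative; class and method names are our own, not from the project's code) contrasts a hardware-assisted remote write with the software emulation a cluster without DMA must fall back on, where the target CPU has to participate explicitly:

```python
# Sketch of the one-sided communication problem. With DMA/RDMA hardware,
# an MPI_Put-style operation writes directly into the target's memory
# window while the target CPU stays idle. Without such hardware, the
# origin can only enqueue a request, and the target must actively
# service it -- making the "one-sided" operation two-sided in disguise.
import queue

class Window:
    """A crude stand-in for an MPI window: remote memory plus a mailbox."""
    def __init__(self, size):
        self.mem = [0.0] * size
        self.mailbox = queue.Queue()   # software-emulation path

    def put_rdma(self, offset, values):
        # Hardware path: direct remote write, immediately visible.
        self.mem[offset:offset + len(values)] = values

    def put_emulated(self, offset, values):
        # No DMA: the origin merely enqueues the request...
        self.mailbox.put((offset, values))

    def progress(self):
        # ...and the target must drain the mailbox itself
        # (the hidden receive that hurts on clusters).
        while not self.mailbox.empty():
            offset, values = self.mailbox.get()
            self.mem[offset:offset + len(values)] = values

win = Window(8)
win.put_rdma(0, [1.0, 2.0])        # visible immediately
win.put_emulated(4, [3.0, 4.0])    # nothing happens yet...
assert win.mem[4] == 0.0
win.progress()                     # ...until the target participates
```

The need to insert such explicit `progress()` points throughout the computation is one concrete form the porting difficulty takes.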
See links to cluster project documents and reports.
Elster and her students Snorre Boasson and Jan Christian Meyer also looked at a PIC (Particle-in-Cell) code, an electrostatic code that Elster originally wrote for an SMP machine and that was rewritten using MPI for clusters as part of this project.
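For readers unfamiliar with PIC codes, the sketch below (illustrative only; function names and parameters are our own, not from the project's code) shows the two steps that dominate an electrostatic PIC timestep, charge deposition to the grid and the particle push; on a cluster, particles and grid slabs are typically distributed across MPI ranks with halo exchanges between them:

```python
# Minimal 1D electrostatic PIC sketch (illustrative, not the project's
# code). One timestep: deposit charge onto the grid, solve for the
# field (omitted here), then push the particles.

def deposit_charge(positions, ngrid, length):
    """Nearest-grid-point charge deposition (unit charge per particle)."""
    rho = [0.0] * ngrid
    dx = length / ngrid
    for x in positions:
        i = int(x / dx) % ngrid
        rho[i] += 1.0 / dx        # charge density contribution
    return rho

def push_particles(positions, velocities, efield, length, dt):
    """Euler-style push with periodic boundaries (unit charge/mass)."""
    ngrid = len(efield)
    dx = length / ngrid
    for p in range(len(positions)):
        i = int(positions[p] / dx) % ngrid
        velocities[p] += efield[i] * dt
        positions[p] = (positions[p] + velocities[p] * dt) % length

length, ngrid, dt = 1.0, 8, 0.01
pos = [0.1, 0.3, 0.62, 0.9]
vel = [0.0] * 4
rho = deposit_charge(pos, ngrid, length)
efield = [0.0] * ngrid            # field solve omitted in this sketch
push_particles(pos, vel, efield, length, dt)
```

The deposition step is what makes the cluster port delicate: particles near a slab boundary contribute charge to grid cells owned by a neighboring rank, so the MPI version must exchange boundary contributions every step.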
See links to cluster project documents and reports.
A preliminary report was presented at our stand at NOTUR 2003 in Oslo May 14-15, 2003.
The project expected that UiT staff had ported, or would be porting, the following applications from their current SMP system (Athelon) to their Itanium cluster as part of their ongoing cluster efforts:
Due to staff time constraints, this activity limited itself to porting Amber; a summary from the activity leader is given below (originally provided in Norwegian; no English report was written).
Gaussian proved to be a mistake (see below), and Dalton was essentially covered by A1.2b.
Results summarized by the activity leader (translated from the Norwegian original):
As quoted in the email you sent, the project was invoiced for 120 hours, a total of NOK 43,200, in 2003. Of this, 80 hours were for the initial porting, optimization and testing of Amber. The remaining 40 hours were for further testing of Amber, in which we used Scali's MPI (ScaMPI).
Briefly summarized, the result of part 1 was that Amber is a code very well suited to running on Snowstorm; so well suited that no one now requests compute time on Nana for Amber.
Briefly summarized, the result of part 2 was that Amber actually runs 10-25% faster with ScaMPI than with the standard MPI.
I do not have the exact percentage improvements at hand, and I will not bother Roy with extracting them until after the workshop :-) We have, for example, seen, independently of which MPI we used, that a 4-CPU Amber job generally runs faster on two 2-CPU machines than on one 4-CPU machine. The degree to which this holds naturally varies with the type of job being run (there are so many different kinds of Amber jobs that we have by no means covered them all; we focused on a few types that are heavily used at our site). The reason is that different job types exploit the characteristics of the 2-way machines' infrastructure better than those of the 4-way machines. Typically, I/O-intensive jobs fare better on the 2-way machines than on the 4-way machines.
As for Gaussian, it was quite simply a mistake to include it on the list: standard Gaussian only runs in parallel on SMPs. A cluster-based version also requires Linda = $$$$. Linda is simply too expensive for us to justify a purchase (although Morten Hanshugen in Oslo mentioned just before Easter that the price was about to drop).
We therefore thought we were being clever when we switched to what is perhaps the second most used chemistry code in Norway: ADF. The problem is that we have still not managed to get it running on Itanium... We have put a good deal of work into the porting and have made steady progress, but to this day we do not have a running version. This is a real pity, since this is a code that in theory should work well on clusters. In any case, we have charged the ADF work to the Metacenter part of the project, since the porting itself is something we must do regardless (for Amber, porting to a running code was a minimal job; the time there was spent on optimization and testing).
The original goal of this activity was to include:
A preliminary report was presented as a poster at our stand at NOTUR 2003. The final report was to include more in-depth analyses, including a total cost analysis of running a cluster system vs. a large SMP; this is highlighted in the activity leader's summary above.
Budget: NOK 150.000 for post doc/students.
This activity included using state-of-the-art optimization techniques to port a popular application to a compute cluster. We selected Dalton for this effort since we were able to work directly with this Norwegian vendor. Dalton is also a competitor to Gaussian, a very popular user application at all Norwegian HPC sites.
This activity commenced in spring 2003 with main results made available by NOTUR 2003.
This activity led to the following activities and reports, including time estimates for Otto Anshus (OA) and John Markus Bjørndalen (JMB):
See links to cluster project documents and reports.
Participants: Tore Larsen, Otto Anshus and students (Computer Science/UiT)
Budget: NOK 100.000 for student support.
This activity extended the ongoing work of the Distributed Systems Group at UiT on execution monitoring and tools for clusters. These efforts included a special focus on applicability to future NOTUR activities. A survey of current technologies in the field is included. The activity also included an analysis of what may be necessary for using this technology as a compute server for a display wall.
This activity led to the following activities and reports, including time estimates for Otto Anshus (OA) and John Markus Bjørndalen (JMB):
See links to cluster project documents and reports.
Budget: NOK 100.000 for student support.
This activity looked at how suitable a specialized cluster may be as a compute engine for visualization and related applications.
This activity commenced in January 2003 and ran throughout the project.
A short summary follows (translated from Norwegian). Also see Vik's report on Chromium vs. SGI visualization hardware listed at the end of this report.
Two different ways of using clusters:
* off-line (non-real-time rendering). These are often so-called "rendering farms": a large number of machines that each work on their own frame of a larger animation. Typically used in the film industry and other areas where interactivity and/or real-time updates are not needed. All the larger 3D modeling packages, such as Lightwave, 3DStudio and Maya, have functionality for this.
* on-line (or real-time). The most interesting from a technological standpoint; the rest of this summary concerns this mode.
Clusters are used in interactive visualization software to increase performance, to enable larger datasets, and to avoid limitations of the local hardware. Most visualization clusters work, in principle, by having the user sit at a client machine that in itself has little capacity. The cluster handles all computation and sends only the finished images to the client. The client machine also collects input from the user and forwards it to the cluster. Datasets for such visualization are often very large, and, depending on the situation, both polygon-based and voxel-based rendering are used.
The main problem in making clusters usable for interactive visualization applications is network-induced delay; this is usually the worst problem. It is addressed by reducing the time spent transferring images between cluster and client, either by reducing the amount of data (compression methods) or by increasing network performance, or both.
Parallelism within the cluster itself is based on independence between different data: there may be independence between different parts of the same dataset, or between different frames in a 4D dataset. Load balancing often becomes a problem in such settings and is an important research area. The choice of load-balancing method is usually highly context-dependent.
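The network-delay problem described above can be made concrete with a simple back-of-the-envelope model (all numbers below are illustrative assumptions, not project measurements): per-frame delivery time is roughly network latency plus wire-transfer time, and compression trades codec CPU time for bytes on the wire.

```python
# Simple model of cluster-to-client frame delivery time for remote
# visualization. All parameters are assumed, illustrative values.

def frame_time(frame_bytes, latency_s, bandwidth_Bps,
               compression_ratio=1.0, codec_s=0.0):
    """Seconds to deliver one rendered frame from cluster to client."""
    wire_bytes = frame_bytes / compression_ratio
    return latency_s + wire_bytes / bandwidth_Bps + codec_s

# A 1024x768 24-bit frame over an assumed 100 Mbit/s Ethernet link:
frame = 1024 * 768 * 3
raw = frame_time(frame, latency_s=0.5e-3, bandwidth_Bps=100e6 / 8)
# Same frame with an assumed 10:1 codec costing 5 ms of CPU per frame:
packed = frame_time(frame, 0.5e-3, 100e6 / 8,
                    compression_ratio=10, codec_s=5e-3)
print(f"raw: {raw*1e3:.1f} ms/frame, compressed: {packed*1e3:.1f} ms/frame")
```

Under these assumed numbers the wire transfer, not the base latency, dominates the raw case, which is why the summary above lists compression and faster networks as the two remedies.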
See also links to cluster project documents and reports.
Budget: NOK 100.000 for student support. NOTE: Only around NOK 50K was used in this project due to a lack of available students. The remaining funds were transferred to activity A1.
This activity's goal was to evaluate the impact of future HPC technologies on selected numerical algorithms and computational strategies. The examples included an evaluation of higher-order methods for the numerical solution of partial differential equations. A recently proposed computational approach based on parallelization in time of numerical algorithms was also evaluated. Some of the activity included numerical tests of generic operations.
This activity was performed during the summer of 2003.
The results from this work are given in the following report: "The Parareal Algorithm -- A survey of present work"
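As an illustration of the parallel-in-time idea the report surveys (a minimal sketch under assumed parameters, not the report's implementation), the parareal iteration combines a cheap serial coarse propagator G with an accurate fine propagator F; the expensive fine solves over the time slices are mutually independent and could run in parallel:

```python
# Minimal parareal sketch for dy/dt = lam * y (illustrative only).
# The update per iteration is:
#   y[n+1] = G(y_new[n]) + F(y_old[n]) - G(y_old[n])
# where the F applications are embarrassingly parallel across slices.

def euler(y, lam, t0, t1, steps):
    """Explicit Euler from t0 to t1: the propagator building block."""
    dt = (t1 - t0) / steps
    for _ in range(steps):
        y += dt * lam * y
    return y

def parareal(y0, lam, T, nslices, iters, coarse_steps=1, fine_steps=100):
    ts = [T * n / nslices for n in range(nslices + 1)]
    G = lambda y, a, b: euler(y, lam, a, b, coarse_steps)  # cheap
    F = lambda y, a, b: euler(y, lam, a, b, fine_steps)    # accurate
    # Initial guess from a serial sweep of the coarse propagator.
    y = [y0]
    for n in range(nslices):
        y.append(G(y[n], ts[n], ts[n + 1]))
    for _ in range(iters):
        # Fine and coarse solves over each slice: independent, parallel in time.
        fine = [F(y[n], ts[n], ts[n + 1]) for n in range(nslices)]
        gold = [G(y[n], ts[n], ts[n + 1]) for n in range(nslices)]
        ynew = [y0]
        for n in range(nslices):
            # Serial coarse sweep plus the parareal correction term.
            ynew.append(G(ynew[n], ts[n], ts[n + 1]) + fine[n] - gold[n])
        y = ynew
    return y[-1]

# dy/dt = -y, y(0) = 1: exact solution y(1) = exp(-1) ~ 0.3679.
approx = parareal(1.0, -1.0, 1.0, nslices=4, iters=4, fine_steps=200)
```

After k iterations the first k slice endpoints match the serial fine solution exactly, so with iters equal to nslices the method reproduces the fine solver; the interest lies in getting acceptable accuracy with far fewer iterations.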
Budget: NOK 50.000
This activity focused on collaboration with the ET-Grid project. A test grid using a local cluster was set up at NTNU as part of the Grid.
Here we looked at how our results impact current Grid efforts. In particular, we wanted to look at heterogeneous clusters, since many of the performance issues with such clusters relate strongly to applications spread over a computational grid.
Participant: Anne C. Elster (Computer Science/NTNU)
This activity included all administration and coordination of the project, including status reports and the final report.
This activity commenced immediately and continued throughout the project. The activities included among others:
In addition, the project participants attended several national and international meetings, promoting NOTUR and this project as well as significantly increasing our own knowledge of the field.
Overall, the project provided a lot of insight into cluster computing as an emerging HPC technology, as can be seen from the many conclusions and reports provided in this document.
It is an active and evolving area of HPC that Norway needs to stay on top of in the future. We are therefore pleased to see that this project is being continued as part of the NOTUR 2004 Competency projects.
Otto Anshus -- Associate Professor in CS (IFI) at Univ. of Tromsø
Worked extensively on activities A1.2.b and A2, which produced several reports.
John Markus Bjørndalen -- Associate Professor, CS (IFI) Univ. of Tromsø
Worked extensively on activities A1.2.b and A2, which produced several reports.
Snorre Boasson -- MS Student, IDI, NTNU
Mr. Boasson worked part-time for the NTNU CSE project as a web designer before joining the NOTUR ET Cluster project. He worked with Jan Christian Meyer on A1.1, looking at PIC codes. See links to cluster project documents and reports. He is currently working for the NOTUR ET 2004 project, continuing his Master's work on testing large applications w.r.t. I/O handling.
Lars Ailo Bongo -- Graduate student in CS (IFI) at Univ. of Tromsø
Worked extensively on activities A1.2.b and A2 with Anshus and Bjørndalen, which produced several reports. See links to cluster project documents and reports.
Roy Dragseth -- Computing Center Staff, Univ. of Tromsø
Mr. Dragseth is the national leader of the current (2004) Cluster competency project, the follow-up subproject related to this one. He was part of the UiT CC team that worked on part A1.2a -- Profiling and user analysis of Amber, a well-known molecular dynamics code. See A1.2a Results.
Anne C. Elster -- CLUSTER PROJECT LEADER and Associate Professor, IDI, NTNU
As can be seen from the list of people involved, Dr. Elster got several of her colleagues and former students involved in this project. She is responsible for this web page/final report.
Tor Johansen -- Computing Center Staff, Univ. of Tromsø
Mr. Johansen was the leader of the UiT CC team that worked on part A1.2a -- Profiling and user analysis of Amber, a well-known molecular dynamics code. See A1.2a Results.
Tore Larsen -- CS (IFI) Univ. of Tromsø
Mr. Larsen worked closely with Prof. Anshus and helped coordinate with the Grid project. As part of the project he participated in several meetings, including the initial Norwegian Grid organizational meeting at UNINETT in Trondheim as well as NOTUR 2003 in Oslo.
Jan Christian Meyer -- MS Student, IDI, NTNU
During fall 2003 he worked for the NOTUR Cluster project on PIC codes, and he is currently working on the NOTUR 2004 Competency Project at ITEA.
Einar Rønquist -- Professor, IMF, NTNU
Paul Sack -- Temporary programmer/summer student, ITEA, NTNU
Mr. Sack worked on HPC benchmarking at NTNU, sponsored by ITEA (May 27 through Sept. 20, 2002), as part of the precursor to this project. His work falls under A1.1 -- Physics and Chemistry (Protomol and PIC codes), and produced the following PDF documents: After working at NTNU, he joined the MS/PhD program in C.S. at the Univ. of Illinois, Urbana-Champaign in January 2003.
Gunnar Staff -- M.S. student, IMF, NTNU
Mr. Staff joined the Simula Research Lab in Oslo, fall 2003.
Steinar Trædal-Henden -- Computing Center Staff, Univ. of Tromsø
Mr. Trædal-Henden was part of the UiT CC team that worked on part A1.2a -- Profiling and user analysis of Amber, a well-known molecular dynamics code. See A1.2a Results.
Torbjørn Vik -- MS Student, IDI, NTNU
Mr. Vik was co-supervised by Torbjørn Hallgren (Elster was the main advisor). He graduated with his M.S. degree in June 2003 and continued working with Dr. Elster on the Cluster project (A3: Visualization servers, etc.) on a part-time basis while exploring his more artistic side in the movie/graphics business during the fall 2003 semester. He joined Schlumberger's visualization group in Oslo in January 2004.
Åsmund Østvold -- MS Student, IDI, NTNU
Mr. Østvold's Master's work was co-supervised by Dr. Elster and Håkon Bugge of SCALI, a high-performance clustering vendor. Mr. Østvold graduated with his M.S. degree in July 2003 and received a fellowship from the University of Minnesota for the 2003/2004 school year. He worked on the Cluster Project during fall 2002 on A1.1, where he benchmarked Protomol, a molecular dynamics code running on our HPC systems that was parallelized for SMPs by colleagues in Bergen. See links to cluster project documents and reports.
Elster helped raise NOK 1 million at NTNU for these projects,
which made NTNU the largest partner. These funds are now matched
by the Research Council of Norway (RCN), which also adds
NOK 450K for expanded storage hardware.
The reports are listed according to their associated activity number (A1-A6)
Note that overview and status presentations are found under A6.
A1: Profiling and tuning of selected applications
A1.1: Physics and Chemistry codes (Protomol and PIC)
A1.2a: Profiling and user analysis of Amber, Dalton and Gaussian
-- See summary in A1.2a project description above
A1.2b:
A2: Execution monitoring
A3: Visualization servers, etc.
A4: Impact of future numerical algorithms and methods
A5: Interface with Grid project
-- See project description summary.
A6: Administration -- Overview and status presentations
Back to Anne C. Elster's Home Page
This page is maintained by:
elster-at-idi.ntnu.no
It was last updated on November 18, 2004. Comments welcome.
Part of NTNU's gang on March 25, 2004
The success of the Cluster and Grid ET projects is shown by the
creation of an expanded follow-up project.
This 2004 project is now a joint effort by NTNU, Univ. of Bergen, Univ. of Oslo,
Univ. of Tromsø and UNINETT. Statoil will also be participating.
See
NOTUR 2004 Competency Project -- GRID, Cluster and Storage
for details re. NTNU's involvement in this follow-up project.
Links to cluster project documents and reports