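Two weeks of vacation. The rough plan:
1. Read through the m5 source, partly to learn how someone else's simulator is written and partly to see whether it can be parallelized.
2. Applications of Parallel Computers: study a few of the lectures.
3. Skim a few e-books to broaden my knowledge.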
Notes on Chapter 1 of Petascale Computing: Algorithms and Applications:
Computational science has become one of the important pillars of science.
The good news: however much computing power there is, it can all be put to use.
The bad news: however much computing power you have, it is never enough.
Improving program scalability is the key.
"An important force which has continued to drive HPC has been a community articulation of “frontier milestones,” i.e., technical goals which symbolize the next level of progress within the field."
"Currently, few real existing HPC codes easily scale to this regime, and major code development efforts are critical to achieve the potential of the new petaflop systems."
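Over the previous decade, the growth in compute capability of high-performance machines came mainly from higher CPU clock frequencies. The performance of the next generation of supercomputers will depend mainly on high-performance interconnects and integration at different levels.
One question: of these five machines, why do some use a 3D torus and others a fat-tree?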
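One piece of the trade-off can be made concrete with a small sketch. In a 3D torus every node has only six links regardless of machine size, but the worst-case hop count grows with the dimensions; a fat-tree typically keeps hop counts lower at the cost of more switch hardware. The code below only models the torus side. The 8x8x8 dimensions and the function names are made up for illustration and do not describe any of the machines in the chapter.

```python
def torus_neighbors(node, dims):
    """Return the six nearest neighbors of `node` in a 3D torus (with wraparound)."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b, taking the shorter way around each ring."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def torus_diameter(dims):
    """Worst-case hop count: half of each ring, summed over the three axes."""
    return sum(d // 2 for d in dims)


dims = (8, 8, 8)  # hypothetical 512-node torus
print(torus_neighbors((0, 0, 0), dims))
print(torus_hops((0, 0, 0), (4, 4, 4), dims))
print(torus_diameter(dims))
```

Doubling each dimension doubles the diameter while the per-node link count stays at six, which is one reason the choice of topology depends on how local the application's communication is.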
BG/L is impressive: the passage below says it has five networks and describes three of them:
The BG/L nodes are connected via five different networks, including a torus, collective tree, and global interrupt tree. The 3D-torus interconnect is used for general-purpose point-to-point message-passing operations using 6 independent point-to-point serial links to the 6 nearest neighbors that operate at 175MB/s per direction (bidirectional) for an aggregate bandwidth of 2.1GB/s per node. The global tree collective network is designed for high-bandwidth broadcast operations (one-to-all) using three links that operate at a peak bandwidth of 350MB/s per direction for an aggregate 2.1GB/s bidirectional bandwidth per node. Finally, the global interrupt network provides fast barriers and interrupts with a system-wide constant latency of ≈ 1.5μs.
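The two 2.1GB/s figures in the quote can be checked with simple arithmetic: number of links times the per-direction rate times two directions. A quick sketch, using only the numbers from the passage (the function name is my own):

```python
def aggregate_bandwidth_mb_s(links, per_direction_mb_s):
    """Aggregate per-node bandwidth: every link carries traffic in both directions."""
    return links * per_direction_mb_s * 2


# 3D torus: 6 links at 175 MB/s per direction
torus = aggregate_bandwidth_mb_s(links=6, per_direction_mb_s=175)
# Collective tree: 3 links at 350 MB/s per direction
tree = aggregate_bandwidth_mb_s(links=3, per_direction_mb_s=350)

print(torus, tree)  # 2100 2100, i.e. 2.1 GB/s each
```

So the torus and the tree end up with the same aggregate per-node bandwidth; the tree simply gets there with half as many links running twice as fast.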
One of the papers uses a tool called IPM, which collects performance data for a workload. It should help with the work I am doing now.
Integrated Performance Monitoring: Understanding Applications and Workloads
It can capture MPI calls and communication information.
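To illustrate the general idea of collecting per-call communication statistics, here is a toy sketch that wraps a function and records how often it is called and how many bytes pass through it. This is not IPM's API and it does not use MPI at all: `fake_send`, `profiled`, and `stats` are hypothetical names invented for this example.

```python
from collections import defaultdict
from functools import wraps

# Per-function counters: number of calls and total payload bytes.
stats = defaultdict(lambda: {"calls": 0, "bytes": 0})


def profiled(func):
    """Wrap a (dest, payload) function and record call count and bytes sent."""
    @wraps(func)
    def wrapper(dest, payload):
        entry = stats[func.__name__]
        entry["calls"] += 1
        entry["bytes"] += len(payload)
        return func(dest, payload)
    return wrapper


@profiled
def fake_send(dest, payload):
    """Hypothetical stand-in for a message send; it does nothing."""
    return None


# Pretend to send 1 KiB to each of four ranks.
for rank in range(4):
    fake_send(rank, b"x" * 1024)

print(dict(stats))
```

A real tool would intercept the actual MPI calls and also record timing, but the resulting table of calls and bytes per function is the kind of data that helps answer what a workload demands from the interconnect.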
"What sort of interconnect does your workload need?" is a good question.