1), (i) · 2018. 9. 25. · ? 9 • v $ tsqxoktwu •:*v • 1993 3& ; 0 ad+ 3 • 1995 3&...


$�)%(�'&)�(1), (I)��#) �������!)"*

� � � ��

���������+�,������� �����

2018/9/25 $�)%(�'&)� (1), (I) 1


�� (���

1. �����

2. ����

3. ����� �

4. ������

5. �(!' ��%()����*


@4W/-

•I�7<3W026TNR)L#]QZWJ��EW�9^5�O]1. ���.iwjxWhxmxdwnsxj *%2. ����%TMPIW�K�3. I�7<3�%

•�=�9W5�V[\J�@)W�CW02^"$VAZ]MT^/-TO]

•<31�_t`_wh?�@4zbkeuxD)• http://www.compsci-alliance.jp• �H_t`_whYW,D^yhttp://www.compsci-alliance.jp/�)�G/

• ;�^(QPX��>^+8• -U'�&BJ�FfwpgarSW+:• !���hmdwW_bawl+8


���(.40/�%(���#

•/1-625,436,798• �������7����8

•/1-625,436,7�8• �������� ���7����8

•���(��!��� (�� �+��%"*$

•���)!��!���(����%$• ��!���&+'!�(��� %$


?�9�• ��v� $�tSQX OKTWu

• :*v• 1993�3& � ;���0��AD+��3 (

• 1995�3& � ;�����C0��253<7)3��#��>4��

• 1998�3& � ;�����C0��253<7)3��#��>4��

• 1998�4& '��3����A����3 ?�

• 2002�4&�2007�3& '��3��^qimrc_Z[qa�A ?�

• 2007�4&�2008�11& 6,��<73�25bqcr 25�

• 2008�12&w2013�11& 6,��`ael����253%"

• 2013�12&w2015�11& '��� ���1bqcr .�%"

• 2015�12&w/� ���1bqcr %"

• 2016�2&�/� t�!u��C��8253D+8���#

• EGPU^qimreYq]FEah^q�Jgdfpr\FQPR�BN-@�s

• =KISE�$�FN]]VOGWGWMHUL


:3�-i��<;,)j

1. 9 25�(��)o LKSgR2. 10 2�

l � ��&F�"%0i��j

3. 10 9�oRWPg�'=�l fOKg�#BURVYfOb]�5

4. 10 16�l A�4YfOb\gO�$F�+li>�^_cBdhYJgfhcgOj (��)

5. 10 23�l A�4YfOb\gO�$F�+2iM`TQaXfTN�j

6. 10 30�l 5 -ZNVd.F� �


7. 11 6�l HD�$F� �

8. 11 20�l 5 -5 .F� �ilj

9. 11 27�l 5 k5 .F� �imj

10. 12 4�l pq�7$ilj

l PgURV9@(6

11. 12 11�l pq�7$imj

12. 12 18�l pq�7$injB?�!;�

13. 1 8�l RB-HC8EB*/1��

e[hVCIGPgURV9@i2�o

2019�2 4�i j24� ��


,���

1. �'D0<SR-0T+9a^d[H=C�l %��

l +;K+:MF,�81NR

l >LCI 0T+:�*JG7e+;G7MF�:>f

2. XcZY[-0T+:l �.eg�ji�e��ffJ4 ��H2�3e��f

l @QUE��>RPITB:SK4�'a^d[�H�

• gIO4hIO4gEhI��4I/�86R5

• gEhI��T(A?��J4�!4%��D�&T��>R5

• �)I'�8#"4$�"H[`V=CM=7


2016���1. 2016(�28)� S1S2

• ������/ 37�• )&,#�� �/ 22�

• ��/1���/13���/4���/3����� +"!#�� ��/3�-3�������.2. 2016(�28)� A1A2

• ������/ 9�• )&,#�� �/ 4�

• �/4����� +"!#�� ��/2�-2�������.3. 2017(�29)� S1S2

• ������/ 60�• )&,#�� �/ 40�

• ��: 3���/18���/13���/3����� +"!#�� ��/5�-3���������.4. 2017(�29)� A1A2

• ������/ 21�• )&,#�� �/ 11�

• �/6���: 1���: 3����� +"!#�� ��/1�-1�������.5. 2018(�30)� S1S2

• ������/ 37�• )&,#�� �/ 15�

• ��: 1���/5���: 8���: 0����� +"!#�� ��/5�-2�������.


6*R!�

•r24mYFDSFortran24nNtvs^%?L�+•VN_OC;���R5:�1�M/0

• wusxR��afkc;�T;ej`ihR��C]BKL?P?N�F?<

• P\UD/0-^"[FI?

• wusxR��afkcC]BKL?P?�;ej`ihR��C]BKL?P?�S;��Q;9�&Q�A\Z@8�FXGRM;7�PD.?LDJE?<

• �^,�m3��#�Q�Fpol'�n

• HR) q����$S(50l �;VW�C=�><


��/�4• ���#2���>L?/=MCM;LDHM?OOakforest-PACS=MCM;LDHM?<=@GP5��'0(

• 10�9�O�P/���.�687LA�-C=KMB5�,��(3��,( �)� '+$*&!

• ���.GPU9I=? (Reedbush-H)1�!0( �"%��N


�������

•����$'* �������������*)+!������

l ��&�!�PDF#��(http://www.cspp.cc.u-tokyo.ac.jp/hanawa/class/


�/�• Mlmkrh+gF:Y`�,Se��`��b[N

• ��% K'#�&K9$�� 0

• "�(�KHJGI;=<F:@=?<B?@AA<KHJGI;=?F:ECD;@=?<B?@AACK*��s><=A�>�=D�K=CB7

• O �`)�P• lmkr`24�[WL��hSfdWU24VZQbWL

•lmkra�_�RgS•lmkra]i^�.c[K^X6U3-[Tg`S•���1K��`58\�!��K^]


�)�|#-�}• Uorlzsykvtzk�5~��'`JLIa�-~V

• $! �" .S

• �%�SIMFKCAB9>9;=9:@<>?=9>S(���<:;=�=�;<�S �DE?87<::6

• W��a&�X

• G02_/3• G02SHPQRQNOC:02anzsxsykvuZ��•��ixmwpubS�_gYfe][3�•�4,a��h�^jq{•��b�+T�d^����1*h�c��\a�5�


�&�} )�~• \��uzmxv{m�2�o{uyuzmxwc�gITORGJdITORBDD]

• !� �� +

• ���"�ZFKCH687A3;8:7=9;<=:ZFKCH68:A3@>?6;8:7=9;<=<Z% �� 978<�<�9<�

• ^��e#�_• D./ZESUWUMR@7./c-0• D./ZESUWUMR@7./e,�eo{uyuzmxw`���*}qk{z|r��~

•�1(e��j�bls|• LQRNSXV3JD )�*4DYPXQR$5[ptn{ci )�*[•��f'[hb��uzmxv{mj�g��ae�2�


�Q7• ���4 �D h b]��.R������ h�

• ^E*> <�32 D� OS�B=!A <�C��32 ��­D"® S�cE�@ / ��32��­D"® S�';9� J0�32 ��­("® S� �H:A¬5LU�1 S�

• �©�I�GT)85±ljkjijnimjd° � �± so° £«�4±lpl`�vxtw±rpqhnhmmrhjloqrhp° #�±mgrrj� ­;mgqjj�¯Koefd

• �;7�C-�• u|}~}z{XZ�W[• 4 �¨�§�¥��4+����%�[6• ;\P��$������9 �a�W?�FT��,W?�uuy��«������V�4 YM�¨�§�¥�� «

• �$��N²�N�&_�����4 YM�"������


�(�k\aZichYgdiY�6lM�%�#

• �2-M�9$.�VW`jBE3�• �(�J;<QS9�!V%;E��chYgdiYM$.�JBH�,G@RUH;T

• �=T��0+!>:TAJ> �

• ��M$.chYgeM�%LF;H1. ������&]i^jMFX10\jajZibfj^[\_eIDMOO�%CT

2. ')�MPCXg\^kMPI>�%I?TPMlI�%CT3. ���M��*M\jajZibfj^I�%CT

• �/MPCV%;H9kMPIINK;l4"chYgeI$.CTk�L4"chYgeM85�M17l


�������� �������


�ª�ª�©�£ª���• �#LU.a��©�£ª����pkf��©�£ª����p• 2M��T��p

• C��6j§�¦�@O(U��|[O=�u�• RiH��kPC�®­­­jc�k®­­­� f�¡¢¥��|[O=

• m��A5����_3>n=>mb�_3PD�n«/�¬=>

mb�_3PD��WN�����A5��W�Z���}t^B��-V����K�n«$*28%117181R?E<K�¬�Z�!]���¦g�[O=

• N¯9N�h�°���¦g�[O=�o{~k�e6j(Us12.5���¤@O�`r���

• �ª�ª�©�£ª�"�,S ($*26%37311Gw��z)

• I dFQ� 3. u�,S�50TFLOPS��D\H6j(U�8y��ª�ª�©�£ª�ª�"��dFv��sku�!]Q��&X�'xYJyu��y�l

• C�k����y�~��ª�ª�©�£ª��� [O=• :���)��I�©�s+8y�Reedbush�ª�ª�©�£ª���� kOakforest-

PACS�ª�ª�©�£ª���� �k� [O=


�4kGPU 0;�`r~w�q!!


S^PmB�. (1)��B:D<BSn^nPm_gnUC��("�r��BCPUK�%:H7>=/�&K��

• MPP (Massively Parallel Processor)• S^Pm� �r��,PRIMEHPC-FXRinT0Cray XCRinT@?• � B\nZ-[VYlnN(MmUP[NY): ��,TOFU, Cray Aries (�)�8I@2)

• NhSU• ��BQn]K[VYlnN=#�9<1RSWd9;FB• \nZ-[VYlnNrIntel OmniPath, InfiniBand@?PfXLWL'

• EthernetC1EG�JI@2��CoS^Pm>9<Cp$�A+2r• aNYj

• NEC SXRinT• efiRSWd�*BPSY3�4:5H

• SMP (Symmetric Multi Processor)• ��efibRmrHP (�SGI) UVRinT0256CPU!�E=q7I���46=4@2


���"�� (2)

• ���� #�����

• GPU (NVIDIA Tesla)• PEZY-SC2• Intel Xeon Phi (Knights

Corner)• NEC SX-Aurora

TSUBASA• �

• PCI Express�#�=> ��"���#�


��ITC�GPU�����"�Reedbush-L���


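Units of supercomputer performance
• TFLOPS (teraflops): Tera Floating Point Operations Per Second
• The number of floating-point operations (additions and multiplications on real numbers) executed per second is measured in FLOPS.
• K (kilo) = 1,000 (thousand); M (mega) = 1,000,000 (million); G (giga) = 1,000,000,000 (billion); T (tera) = 1,000,000,000,000 (trillion)
• 1 TFLOPS: one trillion floating-point operations per second.

• PFLOPS (petaflops)
• 10^15 floating-point operations per second.
• K computer (September 2012): 11.2 PFLOPS

l What about a PC?
l At 4.2 GHz (4.2 billion clock cycles per second), peak performance is the clock rate multiplied by the floating-point operations per cycle per core and by the number of cores.
l Intel Core i7 (Skylake): 4 cores, 16 floating-point operations per cycle per core, so 4.2 GHz * 16 FLOP/Hz * 4 cores = 268.8 GFLOPS
l Cray-1: 160 MFLOPS. Such a PC is 1,680 times faster than that supercomputer.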
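The PC example on this slide can be checked with a few lines of Python. This is a minimal sketch: the function name `peak_flops` is a hypothetical helper introduced only for illustration, and the inputs are the figures quoted on the slide.

```python
def peak_flops(clock_hz, flop_per_cycle, cores):
    """Theoretical peak = clock rate x FLOP per cycle per core x number of cores."""
    return clock_hz * flop_per_cycle * cores

# Figures quoted on the slide for an Intel Core i7 (Skylake) PC.
pc_peak = peak_flops(clock_hz=4.2e9, flop_per_cycle=16, cores=4)

# Cray-1 peak as quoted on the slide: 160 MFLOPS.
cray1_peak = 160e6

print(f"PC peak: {pc_peak / 1e9:.1f} GFLOPS")
print(f"PC / Cray-1 ratio: {pc_peak / cray1_peak:.0f}x")
```

The same formula applies to any processor once the per-cycle operation count and core count are known; it gives an upper bound, not a measured value.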


Sg[gRf\cgUD,���• $-\gP�(hTheoretical Peak Performancei

• ZgYMNL�(7GE=9<?�(3

• jPeWPC�)A9H ���# �7G'<?FLOPS�K�6;B8�53

• ��(hEffective Performancei• �G7D_fV`gPT]XMOLK�)<@�)�1K+!3

• >D_fV`gP^eQdbC�JI@5H ���#"'K'3

• ��D�K�C'<?FLOPS�D;B3

• /&���%�D�*_fV`gPA4HLINPACK2���0� (CG

�iD�*_fV`gPA4HHPCG8�:�JI@5H3

• ���#��D_fV`gPF��h�.i


0=3=/<49=1'����

• LINPACK6<27=-&'500�)&':<,<.

• #������'� (���)

• ��4�&10�• ����%"�($��

Linpack��

1�500�'�!

500�

1�

�/<49=1

��� (TaihuLight)

��2 (Tianhe-2)�/<49=1

SequoiaTitan

http://www.top500.org/ *+


LR:4��• �Intel�4���?RFOQLR:(��*."�����4��3�,8�!�#

$���CDI4 �5"&6-TWP 0U�328%

• )9'7�+/"

$J;<NINBD@4�5"&6-TWP 0U�328%

• ��3681"�V�0TS�128#• ��35"EGRF�


7J?J4IADJ8*EI1I3(1)

• TOP500Khttp://www.top500.org/L• LINPACK*�#-����.��%'�*500�+)*EI1I3

•�0J2F96����M;=5J�>927@G�* Jack Dongarra ��$��

•� 6�@</:* ��ISC!11�@�* ��SC)��K�*""500,�(�&L


-E5E+D6?E/!@D(D*(2)

• Green500Fhttp://www.green500.org/G

• Top500 ���%�,-2<!������#���!@D(D*

• Linpack���!��& ����/��=FLOPS/W

• HPCG (http://www.hpcg-benchmark.org/)

•����(CG�) "$8D0:E)

•����!����!Linpack"#=>A').-�93B41) ���!'7A ��

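The HPCG benchmark mentioned on the Green500/HPCG slide is built around the conjugate gradient (CG) method. The following is a minimal sketch of plain CG on a small dense symmetric positive-definite system. It is an illustration only: the real HPCG benchmark uses a large sparse matrix and a preconditioner, and the helper names `dot`, `matvec`, and `conjugate_gradient` are hypothetical.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A (unpreconditioned CG)."""
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
print(x)  # close to [1/11, 7/11]
```

CG is dominated by sparse matrix-vector products and dot products, which stress memory bandwidth rather than floating-point units. That is consistent with the HPCG table, where results are a few percent of peak.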

+<1<);27<,"8;';((3)

• Graph500=http://graph500.org/>• ��(83��&���TEPS (Traversed Edges Per Second)�8;';(

• ���� (BFS)!����2017/11�$SSSP (Single Source Shortest Paths)#��!

• Green Graph500 ��"#�%=�� ?>• IO500 (http://www.io500.org/)

• +.9<*"��=��"����>• 6,-<,�� (IOPS)•0;/ (GB/sec)

• 2017/11�$
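Graph500, listed above, ranks systems by TEPS (Traversed Edges Per Second) on breadth-first search (BFS). The sketch below shows the idea on a tiny graph. It is a simplification under stated assumptions: the official benchmark uses a generated large-scale graph and its own edge-counting rules, and `bfs_count_edges` is a hypothetical helper that simply counts adjacency-list entries scanned.

```python
import time
from collections import deque

def bfs_count_edges(adj, source):
    """Breadth-first search; returns the parent map and number of edge visits."""
    parent = {source: source}
    queue = deque([source])
    traversed = 0
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            traversed += 1          # each scanned adjacency entry counts once
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent, traversed

# Small undirected graph stored as adjacency lists.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}

start = time.perf_counter()
parent, traversed = bfs_count_edges(adj, 0)
elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval

print(f"traversed edges: {traversed}, TEPS: {traversed / elapsed:.3e}")
```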


http://www.top500.org/

51st TOP500 List (June 2018)
Rmax: Linpack performance (TFLOPS); Rpeak: peak performance (TFLOPS); Power: kW

Rank | Computer, Year, Country, Site | System | Cores | Rmax (TFLOPS) | Rpeak (TFLOPS) | Power (kW)
1 | Summit, 2018, USA, DOE/SC/Oak Ridge National Laboratory | IBM Power System AC922, IBM POWER9 22C 3.07GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband | 2,282,544 | 122,300 (= 122.3 PF) | 187,659 | 8,806
2 | Sunway TaihuLight, 2016, China, National Supercomputing Center in Wuxi | Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway | 10,649,600 | 93,015 | 125,436 | 15,371
3 | Sierra, 2018, USA, DOE/NNSA/LLNL | IBM Power System S922LC, IBM POWER9 22C 3.1GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband | 1,572,480 | 71,610 | 119,194 | -
4 | Tianhe-2A, 2018, China, National Super Computer Center in Guangzhou | TH-IVB-FEP Cluster, Intel Xeon E5-2692v2 12C 2.2GHz, TH Express-2, Matrix-2000 | 4,981,760 | 61,445 | 100,679 | 18,482
5 | ABCI (AI Bridging Cloud Infrastructure), 2018, Japan, National Institute of Advanced Industrial Science and Technology (AIST) | PRIMERGY CX2550 M4, Xeon Gold 6148 20C 2.4GHz, NVIDIA Tesla V100 SXM2, Infiniband EDR | 391,680 | 19,880 | 32,577 | 1,649
6 | Piz Daint, 2017, Switzerland, Swiss National Supercomputing Centre (CSCS) | Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100 | 361,760 | 19,590 | 25,326 | 2,272
7 | Titan, 2012, USA, DOE/SC/Oak Ridge National Laboratory | Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x | 560,640 | 17,590 | 27,113 | 8,209
8 | Sequoia, 2011, USA, DOE/NNSA/LLNL | BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | 1,572,864 | 17,173 | 20,133 | 7,890
9 | Trinity, 2017, USA, DOE/NNSA/LANL/SNL | Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect | 979,968 | 14,137 | 43,903 | 3,844
10 | Cori, 2016, USA, DOE/SC/LBNL/NERSC | Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect | 622,336 | 14,016 | 27,881 | 3,939
12 | Oakforest-PACS, 2016, Japan, Joint Center for Advanced High Performance Computing | PRIMERGY CX1640 M1, Intel Xeon Phi 7250 68C 1.4GHz, Intel Omni-Path | 556,104 | 13,556 | 24,913 | 2,719


http://www.hpcg-benchmark.org/

HPCG Ranking (June 2018)

Rank | Computer | Cores | HPL Rmax (Pflop/s) | TOP500 Rank | HPCG (Pflop/s) | Fraction of Peak
1 | Summit | 2,392,000 | 122.300 | 1 | 2.926 | 1.5%
2 | Sierra | 835,584 | 71.610 | 3 | 1.796 | 1.5%
3 | K computer | 705,024 | 10.510 | 16 | 0.603 | 5.3%
4 | Trinity | 979,072 | 14.137 | 9 | 0.546 | 1.8%
5 | Piz Daint | 361,760 | 19.590 | 6 | 0.486 | 1.9%
6 | Sunway TaihuLight | 10,649,600 | 93.015 | 2 | 0.481 | 0.4%
7 | Oakforest-PACS | 557,056 | 13.555 | 12 | 0.385 | 1.5%
8 | Cori | 632,400 | 13.832 | 10 | 0.355 | 1.3%
9 | Tera-1000-2 | 522,240 | 11.965 | 14 | 0.334 | 1.4%
10 | Sequoia | 1,572,864 | 17.173 | 8 | 0.330 | 1.6%


Green 500 Ranking (June 2018)

Rank | TOP500 Rank | System | Cores | HPL Rmax (TFlop/s) | Power (kW) | GFLOPS/W
1 | 359 | Shoubu system B, Japan | 794,400 | 858 | 47 | 18.404
2 | 419 | Suiren2, Japan | 762,624 | 798 | 47 | 16.835
3 | 385 | Sakura, Japan | 794,400 | 825 | 50 | 16.657
4 | 227 | DGX SaturnV Volta, USA | 22,440 | 1,070 | 97 | 15.113
5 | 1 | Summit, USA | 2,282,544 | 122,300 | 8,806 | 13.889
6 | 19 | TSUBAME3.0, Japan | 135,828 | 8,125 | 792 | 13.704
7 | 287 | AIST AI Cloud, Japan | 23,400 | 961 | 76 | 12.681
8 | 5 | ABCI, Japan | 391,680 | 19,880 | 1,649 | 12.054
9 | 255 | MareNostrum P9 CTE, Spain | 19,440 | 1,018 | 86 | 11.865
10 | 171 | RAIDEN GPU, Japan | 35,360 | 1,213 | 107 | 11.363
13 | 411 | Reedbush-L, U.Tokyo, Japan | 16,640 | 806 | 79 | 10.167
19 | 414 | Reedbush-H, U.Tokyo, Japan | 17,760 | 802 | 94 | 8.575
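The GFLOPS/W column of the Green500 table can be reproduced approximately from the Rmax and power columns. This is a minimal sketch that reads Rmax in TFLOPS and power in kW, which is the reading consistent with the listed efficiency values. The helper name `gflops_per_watt` is hypothetical, and small differences from the published figures are expected because the table shows rounded power values.

```python
def gflops_per_watt(rmax_tflops, power_kw):
    """Energy efficiency: (TFLOPS -> GFLOPS) divided by (kW -> W)."""
    return (rmax_tflops * 1e3) / (power_kw * 1e3)

# (system, Rmax in TFLOPS, power in kW) as listed in the table.
systems = [
    ("Shoubu system B", 858.0, 47.0),
    ("Summit", 122300.0, 8806.0),
    ("ABCI", 19880.0, 1649.0),
    ("Reedbush-L", 806.0, 79.0),
]

for name, rmax, power in systems:
    print(f"{name}: {gflops_per_watt(rmax, power):.3f} GFLOPS/W")
```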

http://www.top500.org/


IO 500 Ranking (Nov. 2017)

Rank | Site | Computer | File system | Client nodes | IO500 Score | BW (GiB/s) | MD (kIOP/s)
1 | JCAHPC, Japan | Oakforest-PACS | DDN IME | 204 | 101.48 | 471.25 | 21.85
2 | KAUST, Saudi | Shaheen2 | Cray DataWarp | 300 | 70.90 | 151.53 | 33.17
3 | KAUST, Saudi | Shaheen2 | Lustre | 1000 | 41.00 | 54.17 | 31.03
4 | JSC, Germany | JURON | BeeGFS | 8 | 35.77 | 14.24 | 89.83
5 | DKRZ, Germany | Mistral | Lustre | 100 | 32.15 | 22.77 | 45.39
6 | IBM, USA | Sonasad | Spectrum Scale | 10 | 21.63 | 4.57 | 102.38
7 | Fraunhofer, Germany | Seislab | BeeGFS | 24 | 18.75 | 5.13 | 68.58
8 | PNNL, USA | EMSL Cascade | Lustre | 126 | 11.17 | 4.88 | 25.57
9 | SNL, USA | Serrano | Spectrum Scale | 16 | 4.25 | 0.65 | 27.98
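The IO500 score column in this table matches the geometric mean of the bandwidth (GiB/s) and metadata (kIOP/s) columns. A minimal sketch follows; the helper name `io500_score` is hypothetical and the inputs are rows from the table.

```python
import math

def io500_score(bw_gib_s, md_kiops):
    """Geometric mean of the bandwidth and metadata results."""
    return math.sqrt(bw_gib_s * md_kiops)

rows = [
    ("Oakforest-PACS (DDN IME)", 471.25, 21.85),
    ("Shaheen2 (Cray DataWarp)", 151.53, 33.17),
    ("JURON (BeeGFS)", 14.24, 89.83),
]

for name, bw, md in rows:
    print(f"{name}: {io500_score(bw, md):.2f}")
```

Because the score is a geometric mean, a system needs both high bandwidth and high metadata rates to rank well; JURON reaches rank 4 with only 8 client nodes thanks to its metadata result.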

http://www.io500.org/


Top5009;,;- =2018�6�!�>*+�+��+6/;

l 16�"$R-CCS: �.;48<210.5 PFLOPS

l 19����: TSUBAME3.08.125 PFLOPS

l 25,26��'�5.73 PFLOPS x2

l 32� ���?ITO4.54 PFLOPS

l 50� JAXA: SORA-MA3.15 PFLOPS

l 54��(�: Camphor23.05 PFLOPS

l 56����: FX1002.91 PFLOPS

l 61� )�$: JFRS-12.78 PFLOPS

l 77��&�$: FX1002.38 PFLOPS

l 83�%���: ATERUI II2.08 PFLOPS…

• �����+6/;l 180� �$: Sekirei

1.178 PFLOPSl 411�� �#1;2<:

Reedbush-L0.805 PFLOPS

l 414�� �#1;2<: Reedbush-H

0.802 PFLOPSl 436� �$: Sekirei-ACC

0.864 PFLOPS


Sunway Taihulight�����• ��-��(Wuxi) � �#�,�,�. (NRCPC)• $.� 125.4 PF, Linpack 93.0 PF, 40960!.�• Sunway� SW26010

• 260���'�.��%+���(1+64��)*4�)��, 1.45GHz

• $.��/3.06TF• '(*",��/136.5 GB/s

• �,�� ��/InfiniBand FDR


��/Top500, HPCWire Japan, PCwatch


Piz Daint @ CSCS• ��� ETH Zurich��������� �• ��� 33.8 PF, Linpack 19.5 PF (2017 upgrade)• 5,320 (P100 + Xeon Haswell) + 1,431 Xeon Broadwell

• Cray XC50 + XC40


���https://www.cscs.ch/publications/news/piz-daint-one-of-the-most-powerful-supercomputers-in-the-world/


Cori @ NERSC• NERSC: ���!-�1�(DoE)/1.0�#1�.1��� �(LBNL)�1��• National Energy Research Scientific Computing Center

• 9,688 Intel Xeon Phi (KNL), %1��� 30 PF + 2,388 Intel Xeon (Haswell)• Cray XC40 �� )

• Gerty Cori: �����*,�����"1'-����


�#1*.2&4K-Computer5@�� R-CCS• �� ������%1&24��,2' !/1(5

• 8678�9�����

• ;<=:SPARC64 VIIIfx(CPU�� 128GFLOPS)

• 2011�11�TOP500�LINPACK�

• Theoretical peak: 11.280 PFLOPS• Effective (LINPACK): 10.510 PFLOPS, efficiency: 93.13%


��:������� 4http://www.aics.riken.jp/index.html 5


����-!-�,"'-�2���$'*-�3

•���������$'*-�3• NEC SX-ACE• 5,120�-�• 0�-����4��• ���20.1PFLOPS

• %&) ,��1.3PB/sec

�!�,#+�($,� (1), (I)

��2 ������.http://www.jamstec.go.jp/es/jp/system/index.html/


������ TSUBAME3.0• HPE ICE-XA

• CPU: Intel Xeon E5-2680v4 2.4 GHz (14 cores) x 2(Hyperthreading enabled)

• GPU: NVIDIA Tesla P100 x 4

• Memory: 256GB• 540�

� ������� (1), (I)

��http://www.t3.gsic.titech.ac.jp/sites/default/files/guidance.pdf


��.@/B$-3)@2�$��+-08E6*&'=CGD

39

FY11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

Yayoi: Hitachi SR16000/M1IBM Power-7

54.9 TFLOPS, 11.2 TB

Reedbush, HPEBroadwell + Pascal

1.93 PFLOPS

T2K Tokyo140TF, 31.3TB

Oakforest-PACSFujitsu, Intel KNL25PFLOPS, 919.3TB

BDEC System50+ PFLOPS (?)

Oakleaf-FX: Fujitsu PRIMEHPC FX10, SPARC64 IXfx1.13 PFLOPS, 150 TB

Oakbridge-FX136.2 TFLOPS, 18.4 TB

Reedbush-L HPE

1.43 PFLOPS

Oakbridge-IIIntel/AMD/P9/ARM CPU only

5-10 PFLOPS

Big Data & Extreme Computing

������-B3B)@4:B/

92B)%����-B3B)@4:B/C!�� "�F���A��D

1B/� A+7:>B+;@��-B3B)@4:B/

���,;5���������#��-B3B)@4:B/


2 (or 4))+-4!��• Oakleaf-FX (��� PRIMEHPC FX10)

• 1.135 PF, �(:05<,��, 2012�4�� 2018�3�• Oakbridge-FX (��� PRIMEHPC FX10)

• 136.2 TF, "�# ��>168�#?, 2014�4�� 2018�3�

• Reedbush (HPE, Intel BDW + NVIDIA P100 (Pascal))• ��ITC�&GPU)+-4, DDN IME (Burst Buffer)• .<,��;)358<)6:�+</<(:05<,>2016�7�� 2020�6�?• Reedbush-U: CPU only, 420 nodes, 508 TF (2016�7�)• Reedbush-H: 120 nodes, 2 GPUs/node: 1.42 PF (2017�3�)

• "�#*61 ����� ���%��+</<(:05<,>2017�10�@2020�6�?RB• Reedbush-L: 64 nodes, 4 GPUs/node: 1.43 PF (2017�10�)

• Oakforest-PACS (OFP) (���$Intel Xeon Phi (KNL))• JCAHPC (���CCS=��ITC)


Reedbush����Reedbush-U2016�7�1� ���2016�9�1� ������

Reedbush-H2017�3�1� ���2017�4�3� ������

Reedbush-L2017�10�2� ���2017�11�1� ������

41

Top500: RB-L #291 @ Nov. 2017; RB-H #203 @ Jun. 2017; RB-U #361 @ Nov. 2016

Green500: RB-L #11 @ Nov. 2017; RB-H #11 @ Jun. 2017


[Figure: Reedbush system configuration diagram. Recoverable labels follow.]

• External connection router; 1 Gigabit/10 Gigabit Ethernet network
• Interconnect: 4x EDR InfiniBand (two fabrics shown)

Compute nodes
• Reedbush-U: SGI Rackable C2112-4GP3, 420 nodes, 508.03 TFLOPS; CPU: E5-2695v4 2.1GHz 18core; x420 links
• Reedbush-H: SGI Rackable C1102-GP8, 120 nodes, 240 GPUs, 1.418 PFLOPS; CPU: E5-2695v4 2.1GHz 18core; GPU: NVIDIA Tesla P100 SXM2 x2/node; x240 links (FDR x2/node)
• Reedbush-L: SGI Rackable C1102-GP8, 64 nodes, 256 GPUs, 1.434 PFLOPS; CPU: E5-2695v4 2.1GHz 18core; GPU: NVIDIA Tesla P100 SXM2 x4/node; x128 links (EDR x2/node)

Service nodes
• Login nodes: SGI Rackable C1110-GP2, 6 nodes
• Management servers: SGI Rackable C1110-GP2, 9 nodes
• Auxiliary management servers: SGI Rackable C1110-GP2 x2
• Node specifications shown: E5-2680v4 2.4GHz 14core with 256 GiB memory; E5-2680v4 2.4GHz 14core with 128 GiB memory

Storage
• NFS filesystem: 16 TB
• Lustre filesystem: DDN SFA14KE x3 sets, 5.04 PB; links x24 (OSS(VM): x12 x2) and x4 (MDS: x2)
• Fast cache: DDN IME14K x6 sets, 209 TB; links x36 (IME: 6x6)
• Fast cache: DDN IME240 x8 sets, 153.6 TB; links x16 (IME: 8x2)
• NAS storage: 24 TB
• Further link label: x10 (Ctrl: 8, MDS: 2)

Management
• GbE switch, management port, management console (Mac Pro), power management server, power meters
• Life/management network: 1 Gigabit/10 Gigabit Ethernet network


Reedbush������

43

Item | Reedbush-U | Reedbush-H | Reedbush-L
CPU/node | Intel Xeon E5-2695v4 (Broadwell-EP, 2.1GHz, 18core) x 2 sockets (1.210 TF), 256 GiB (153.6GB/sec) | same | same
GPU | - | NVIDIA Tesla P100 (Pascal, 5.3TF, 720GB/sec, 16GiB) | same
Infiniband | EDR | FDR 2ch | EDR 2ch
Number of nodes | 420 | 120 | 64
Number of GPUs | - | 240 (=120x2) | 256 (=64x4)
Peak performance (TFLOPS) | 509 | 1,417 (145 + 1,272) | 1,433 (76.8 + 1,358)
Memory bandwidth (TB/sec) | 64.5 | 191.2 (18.4+172.8) | 194.2 (9.83+184.3)
Start of operation | 2016.07 | 2017.03 | 2017.10

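The aggregate memory bandwidth row of the Reedbush specification table follows from the per-node figures quoted there: 153.6 GB/sec per CPU node and 720 GB/sec per P100 GPU. The sketch below recomputes the CPU and GPU parts; the helper name `aggregate_bandwidth_tbs` is hypothetical.

```python
CPU_BW_PER_NODE_GBS = 153.6   # two Broadwell-EP sockets per node, from the table
GPU_BW_GBS = 720.0            # one Tesla P100, from the table

def aggregate_bandwidth_tbs(nodes, gpus_per_node):
    """Return (CPU part, GPU part, total) memory bandwidth in TB/sec."""
    cpu = nodes * CPU_BW_PER_NODE_GBS / 1000.0
    gpu = nodes * gpus_per_node * GPU_BW_GBS / 1000.0
    return cpu, gpu, cpu + gpu

# (nodes, GPUs per node) for each subsystem, from the table.
systems = {"Reedbush-U": (420, 0), "Reedbush-H": (120, 2), "Reedbush-L": (64, 4)}

for name, (nodes, gpus) in systems.items():
    cpu, gpu, total = aggregate_bandwidth_tbs(nodes, gpus)
    print(f"{name}: CPU {cpu:.2f} + GPU {gpu:.1f} = {total:.1f} TB/sec")
```

The GPU term dominates on Reedbush-H and Reedbush-L, which is why 64 GPU nodes offer roughly three times the aggregate memory bandwidth of 420 CPU-only nodes.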

Oakforest-PACS (OFP)• 2016�12�1�&�6�• 8,208 Intel Xeon/Phi (KNL)9Q[H�025PFLOPS

• ��4:�+

• TOP 500 9�\��2�]^HPCG 6�\��2�]\2017�11�]

•��(�HPC #�3(JCAHPC: Joint Center for Advanced High Performance Computing)• )"��2*%�$'MYN[• ������ #MYN[

• �����GUYPLC������ #MYN[�B9�!7C�/:��@A>?32<EL[P[JYQV[NKLOTF3.;^��(C�1 8�02* #F�+Z5�<E=DC,-

• http://jcahpc.jp


Oakforest-PACS ��

��.�(5�=B<J08L:C7H=AKB@:=15�3��26� 5������!>J?L,M�N4& 5��-�*'$�#%>J?L,PO��/�)"��5� �9FJB=

��+�www.jiji.com


Oakforest-PACS features (1/2)• Compute nodes

• 68 cores and 3 TFLOPS per node x 8,208 nodes = 25 PFLOPS

• Memory: MCDRAM (high speed, 16 GB) + DDR4 (low speed, 96 GB)

• (8&���

• +4)�#� 17)7&����Fat-Tree'$%68�

• � ����-3�8 17����<�!1,��

• Intel Omni-Path Architecture
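The "68 cores, 3 TFLOPS per node, 8,208 nodes, 25 PFLOPS" figures above can be cross-checked from the processor parameters. This is a sketch under one labeled assumption: 32 double-precision FLOP per cycle per core (2 VPUs x 8 lanes x 2 for fused multiply-add), which is not stated on the slides. The 1.4 GHz clock is the value listed for the Xeon Phi 7250 elsewhere in this deck.

```python
CORES_PER_NODE = 68
CLOCK_HZ = 1.4e9
FLOP_PER_CYCLE = 32      # assumption: 2 VPUs x 8 DP lanes x 2 (FMA) per core
NODES = 8208

node_peak = CORES_PER_NODE * CLOCK_HZ * FLOP_PER_CYCLE
system_peak = node_peak * NODES

print(f"node peak:   {node_peak / 1e12:.2f} TFLOPS")
print(f"system peak: {system_peak / 1e15:.2f} PFLOPS")
```

The result agrees with the rounded 3 TFLOPS and 25 PFLOPS on the slide. The TOP500 list reports a slightly different Rpeak (24,913 TFLOPS), so the exact counting rules evidently differ.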


Oakforest-PACS �� 47%,!9/8 519 (1), (I)2018/9/25

�-:����� 25 PFLOPS*:)� 8,208��*:)

Product ��� PRIMERGY CX600 M1 (2U) + CX1640 M1 x 8node

/8&'" Intel® Xeon Phi™ 7250;��!:): Knights Landing<68 !��1.4 GHz

236 �+9) 16 GB, MCDRAM,�� 490 GB/sec�+9) 96 GB, DDR4-2400, -:� 115.2

GB/sec�����

Product Intel® Omni-Path Architecture69�� 100 Gbps(08$ .7+�&�#49+9)Fat-tree�


Oakforest-PACS '� H2 / 2I• <,-CI/O

• ��<,-C236?:Lustre 26PB

• <,-C.@52A236?HDDN IMEIJ1TB/sec+ #*���,�1PB• ����F;507G4��F����&(��

• ��!�• Green 500%(��6�• LinpackJ 2.72 MW

• 4,986 MFLOPS/WHOFPI• 830 MFLOPS/WH�I

��<,-C236?

<,-C.@52A236?

B5/�$)1209G8'"��


Oakforest-PACS ���1 �2��&��.� ")

Type Lustre File System��� 26.2 PBProduct DataDirect Networks SFA14KE�$0# 500 GB/sec

��&��.�*!�+� ")

Type Burst Buffer, Infinite Memory Engine (by DDN)

��� 940 TB (NVMe SSD,%-"����)Product DataDirect Networks IME14K�$0# 1,560 GB/sec

����� 4.2MW1�����2�,!� 102


Oakforest-PACS >RZTIJG• OS: Red Hat Enterprise Linux (fLHgWiU)-

CentOS 1C? McKernel (#�WiU-���!)• McKernel: ��AICS:*��>`ViNG 4OS

• Linux��-Linux=�@&)-biO\fLd_=�0D�+<7

•]QT�NgYaiS=A�'6ED��.• NgXHdjGCC, Intel Compiler, XcalableMP

• XcalableMP: ��AICS;���:��*��>�\fLd^gL"%

• CBFortran:$(6E9NiU=���F0D5;:-�!>,/�G\eMiPcgF �=*�8D5;2:3D.

• dH[dehG\eMiPcgj Ki\gRiQRZTIJG• ppOpen-HPC, OpenFOAM, ABINIT-MP, PHASE system, FrontFlow/blue,

LAPACK, ScaLAPACK, PETSc, METIS, SuperLU etc.


FX10� ,����

Memory Memory Memory

�012�����

Core#1

Core#2

Core#3

Core#0


-������

Core#13

Core#14

Core#15

Core#12…

L2 (16������12MB)

L1 L1 L1 L1 L1 L1 L1 L1: L1�,��&��'32KB

85GB/=(8Byte�1333MHz�8 channel)

DDR3 DIMMMemory

4GB �2� 4GB �2� 4GB �2� 4GB �2�

,����$%) .8GB�4/32GB

20GB/

TOFUNetwork

ICC


FX10����"$TOFU��#2018/9/25 ��� ����� � (1), (I)

52

�!� �!�

�!��!�

�!�

�!��!�

�!��!�

�!�

�!�

�!�

$TOFU��

6����5GB/�"���#

���!��

1TOFU���� �

�!�


?TOFU��


FX10&���=?TOFU���& �>2018/9/25 03-;4:,95;, (1), (I) 53


@��

l8<.�*�+#�X��Y��Z�% �!��&1TOFU#���&?TOFU'����!)�(�=@�2<90�>l ����� �"'

l X�'2<90l Y�'61/7l Z�'61/7(�'�2<90

%$�!�(�


Reedbush-U�#� �!���• �� �������� ������� ��

=> NUMA (Non-Uniform Memory Access)(FX10�����)

[Figure: Reedbush-U compute node block diagram. Components shown: 2 x Intel Xeon E5-2695 v4 (Broadwell-EP) connected by QPI; DDR4 memory, 4 channels and 128 GB per socket (76.8 GB/s); one IB EDR HCA on PCIe Gen3 x16 (15.7 GB/s).]


Broadwell-EP��

Memory Memory Memory


)��������

76.8 GB/�=(8Byte 2400MHz 4 channel)

DDR4DIMM Memory

16GB 2� 16GB 2� 16GB 2� 16GB 2�

�������"#%�*16GB 8+128GB

Core#0 to Core#17 (18 cores): each core has its own L1 and L2 cache and a slice of the L3 cache.

QPI x2 PCIe �����L1�(�: 2KB, L2: 256KB, L3: 2.5MB(��) => L3 ����45MB
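The memory bandwidth shown on this slide, 76.8 GB/s = 8 Byte x 2400 MHz x 4 channels, uses a formula that also reproduces the figures given on the FX10 and KNL slides. A minimal sketch follows; the helper name `memory_bandwidth_gbs` is hypothetical.

```python
def memory_bandwidth_gbs(bytes_per_transfer, mega_transfers_per_s, channels):
    """Peak memory bandwidth in GB/s = bus width x transfer rate x channels."""
    return bytes_per_transfer * mega_transfers_per_s * 1e6 * channels / 1e9

configs = {
    "Broadwell-EP, DDR4-2400 x 4 channels": (8, 2400, 4),
    "KNL, DDR4-2400 x 6 channels": (8, 2400, 6),
    "FX10, DDR3-1333 x 8 channels": (8, 1333, 8),
}

for name, (width, rate, channels) in configs.items():
    print(f"{name}: {memory_bandwidth_gbs(width, rate, channels):.1f} GB/s")
```

The last case gives about 85.3 GB/s, which the FX10 slide rounds to 85 GB/s.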


Reedbush-U(���• >D<0614BF<F:.�#Fat Tree�

• &(,�'��;G:.�/$+��'���$�� ��

• Mellanox InfiniBand EDR 4x CS7500: 648@G9• ��)36@G95087 (SB7800). (36+18)��*-!"+(%��• RB-H)+�1 �RB-L%)���'��


[Figure: fat-tree diagram. 36-port leaf switches (36 units), each with 18 downlinks and 18 uplinks; 36-port spine switches (18 units); together equivalent to one 648-port director switch.]
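The switch counts in this fat-tree figure are consistent with a two-level fat tree built from 36-port switches with full bisection bandwidth. The sketch below derives them; `two_level_fat_tree` is a hypothetical helper, and the even split of ports into uplinks and downlinks is the assumption that gives full bisection bandwidth.

```python
def two_level_fat_tree(radix):
    """Size a two-level fat tree built from identical switches of the given radix."""
    down = radix // 2        # leaf ports facing hosts
    up = radix - down        # leaf ports facing spines
    spines = up              # one uplink from every leaf to every spine
    leaves = radix           # each spine port connects to a distinct leaf
    return {
        "downlinks_per_leaf": down,
        "uplinks_per_leaf": up,
        "leaf_switches": leaves,
        "spine_switches": spines,
        "host_ports": leaves * down,
    }

tree = two_level_fat_tree(36)
print(tree)
```

With radix 36 this yields 36 leaf switches, 18 spine switches, and 648 host ports, matching the 648-port director switch named on the slide.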


Reedbush-H��������

[Figure: Reedbush-H compute node block diagram. Components shown: 2 x Intel Xeon E5-2695 v4 (Broadwell-EP) connected by QPI; DDR4 memory, 4 channels and 128 GB per socket (76.8 GB/s); 2 x NVIDIA Pascal GPUs with NVLink (20 GB/s); 2 x PCIe switches with PCIe Gen3 x16 links (15.7 GB/s); 2 x IB FDR HCAs connected to an EDR switch.]


Oakforest-PACS ��'�• Intel Xeon Phi (Knights Landing)

• 1�'�1����

• MCDRAM: �&���'����&��!"$16GB

+ DDR4!"$


Knights Landing Overview

Chip: 36 Tiles interconnected by 2D Mesh
Tile: 2 Cores + 2 VPU/core + 1 MB L2
Memory: MCDRAM: 16 GB on-package; High BW
        DDR4: 6 channels @ 2400 up to 384 GB
IO: 36 lanes PCIe Gen3. 4 lanes of DMI for chipset
Node: 1-Socket only
Fabric: Omni-Path on-package (not shown)
Vector Peak Perf: 3+TF DP and 6+TF SP Flops
Scalar Perf: ~3x over Knights Corner
Streams Triad (GB/s): MCDRAM: 400+; DDR: 90+

[Figure: tile diagram. 2 cores, 2 VPUs per core, 1 MB L2, CHA.]

Source Intel: All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. KNL data are preliminary based on current expectations and are subject to change without notice. 1Binary Compatible with Intel Xeon processors using Haswell Instruction Set (except TSX). 2Bandwidth numbers are based on STREAM-like memory access pattern when MCDRAM used as flat memory. Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Omni-path not shown

EDC EDC PCIe Gen 3

EDC EDC

Tile

DDR MC DDR MC

EDC EDC misc EDC EDC

36 Tiles connected by

2D Mesh Interconnect

MCDRAM MCDRAM MCDRAM MCDRAM

3

DDR4

CHANNELS

3

DDR4

CHANNELS

MCDRAM MCDRAM MCDRAM MCDRAM

DMI

2 x161 x4

X4 DMI

HotChips27

KNL�#����

Knights Landing: Next Intel® Xeon Phi™ Processor

First self-boot Intel® Xeon Phi™ processor that is binary compatible with main line IA. Boots standard OS.

Significant improvement in scalar and vector performance

Integration of Memory on package: innovative memory architecture for high bandwidth and high capacity

Integration of Fabric on package

Potential future options subject to change without notice. All timeframes, features, products and dates are preliminary forecasts and subject to change without further notification.

Three products:
• KNL Self-Boot (Baseline)
• KNL Self-Boot w/ Fabric (Fabric Integrated)
• KNL Card (PCIe-Card)

Intel® Many-Core Processor targeted for HPC and Supercomputing

MCDRAM: 490 GB/s or more (measured)
DDR4: 115.2 GB/s = 8 Byte × 2400 MHz × 6 channels
DDR4 memory capacity: 16 GB × 6 = 96 GB
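The DDR4 figure on this slide is the product of bytes per transfer, transfer rate, and channel count. The following is a minimal Python sketch of that arithmetic; the function name `peak_bandwidth_gb_s` is illustrative, and 1 GB is assumed to mean 10^9 bytes.

```python
def peak_bandwidth_gb_s(bytes_per_transfer: int,
                        mega_transfers_per_s: float,
                        channels: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (assuming 1 GB = 10**9 bytes)."""
    bytes_per_second = bytes_per_transfer * mega_transfers_per_s * 1e6 * channels
    return bytes_per_second / 1e9


# Values taken from the slide: 8 Byte x 2400 MHz x 6 channels.
ddr4_peak = peak_bandwidth_gb_s(8, 2400, 6)
print(f"DDR4 theoretical peak: {ddr4_peak:.1f} GB/s")

# The slide quotes 490 GB/s as a measured lower bound for MCDRAM.
mcdram_measured = 490.0
print(f"MCDRAM (measured) / DDR4 (peak): {mcdram_measured / ddr4_peak:.2f}x")
```

Note that the comparison mixes a measured MCDRAM value with a theoretical DDR4 peak, so the ratio understates the gap seen in practice.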

Page 59

Oakforest-PACS: full-bisection-bandwidth Fat-tree network based on Intel Omni-Path Architecture

[Figure (source: Intel): Fat-tree topology]
• 768-port Director Switch × 12
• 48-port Edge Switch × 362 (label: compute-node racks)
• Each edge switch: Uplink: 24, Downlink: 24

• Costly, but maintains full bisection bandwidth
• Delivers high parallel performance even when the entire system is in use
• Enables flexible operation: high freedom in assigning nodes to jobs
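The switch counts on this slide can be checked with a little arithmetic. The sketch below is only an illustration: the constant names are hypothetical, and the even spreading of each edge switch's uplinks across all director switches is an assumption, not something the slide states.

```python
# Numbers read from the slide.
EDGE_SWITCHES = 362
UPLINKS_PER_EDGE = 24
DOWNLINKS_PER_EDGE = 24
DIRECTOR_SWITCHES = 12
DIRECTOR_PORTS = 768


def fat_tree_summary() -> dict:
    """Summarize port counts for the edge/director Fat-tree described above."""
    node_ports = EDGE_SWITCHES * DOWNLINKS_PER_EDGE    # ports facing compute nodes
    uplinks = EDGE_SWITCHES * UPLINKS_PER_EDGE         # links going up to directors
    director_ports = DIRECTOR_SWITCHES * DIRECTOR_PORTS
    # Assumption: uplinks of each edge switch are spread evenly over the directors.
    links_per_director = uplinks // DIRECTOR_SWITCHES
    return {
        "node_ports": node_ports,
        "uplinks": uplinks,
        "director_ports": director_ports,
        "links_per_director": links_per_director,
        # Equal uplinks and downlinks per edge switch means no oversubscription.
        "full_bisection_at_edge": UPLINKS_PER_EDGE == DOWNLINKS_PER_EDGE,
        "directors_have_capacity": uplinks <= director_ports,
    }


if __name__ == "__main__":
    for key, value in fat_tree_summary().items():
        print(f"{key}: {value}")
```

With 24 uplinks matching 24 downlinks on every edge switch, the traffic entering a switch from its nodes never exceeds what it can forward upward, which is what keeps the bisection bandwidth full.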

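Page 60

Information Technology Center, The University of Tokyo: fee schedule for the Oakforest-PACS supercomputer system (April 1, 2018)

• Personal course (annual)
  • 100,000 yen: 1 unit = 8 nodes (baseline); jobs can run on up to 512 nodes
  • Up to 3 units
• Group course
  • 400,000 yen (corporate: 480,000 yen): 1 unit = 8 nodes (baseline), up to 2048 nodes
• The above are operated on a "token" basis
  • Tokens equal to (number of nodes applied for) × 360 days × 24 hours are granted
  • The personal course is equivalent to 2 nodes
  • Up to the baseline node count, the token consumption coefficient is 1.0
  • Beyond the baseline node count, the consumption coefficient for the excess is 2.0
  • University users can also transfer tokens to and from Reedbush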
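The token rules on this slide (tokens granted = nodes × 360 days × 24 hours; coefficient 1.0 up to the baseline node count and 2.0 for nodes beyond it) can be sketched as follows. This is one reading of the rules, not the center's accounting code: the function names are hypothetical, tokens are assumed to be counted in node-hours, and the higher coefficient is assumed to apply per job only to the nodes above the baseline.

```python
HOURS_PER_YEAR = 360 * 24  # the slide's "360 days x 24 hours"


def granted_tokens(nodes_applied: int) -> int:
    """Tokens (node-hours) granted for a given number of applied nodes."""
    return nodes_applied * HOURS_PER_YEAR


def job_tokens(nodes_used: int, hours: float, baseline_nodes: int,
               excess_coefficient: float = 2.0) -> float:
    """Tokens consumed by one job.

    Nodes up to the baseline are charged at 1.0 per node-hour;
    nodes beyond the baseline are charged at `excess_coefficient`.
    """
    within = min(nodes_used, baseline_nodes)
    excess = max(nodes_used - baseline_nodes, 0)
    return hours * (within * 1.0 + excess * excess_coefficient)


# Group course, 1 unit: baseline of 8 nodes.
budget = granted_tokens(8)
print("granted:", budget)
print("8 nodes for 10 h:", job_tokens(8, 10, baseline_nodes=8))
print("16 nodes for 10 h:", job_tokens(16, 10, baseline_nodes=8))
```

Under this reading, doubling the node count of a job from 8 to 16 triples its token cost, so running wider than the baseline is possible but drains the budget faster.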

Page 61

Information Technology Center, The University of Tokyo: fee schedule for the Reedbush supercomputer system (April 1, 2018)

• Personal course (annual)
  • 150,000 yen: RB-U: 4 nodes (baseline), up to 16 nodes
    RB-H: 1 node (baseline), consumption coefficient 2.5x
    RB-L: 1 node (baseline), consumption coefficient 4.0x
• Group course
  • 300,000 yen: 1 unit = 4 nodes (baseline), up to 128 nodes
    RB-H: 1 node (baseline), token consumption is 2.5 times that of U
    RB-L: 1 node (baseline), token consumption is 4.0 times that of U

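  • RB-U only (corporate): 360,000 yen: 1 unit = 4 nodes (baseline), up to 128 nodes
  • RB-H only (corporate): 216,000 yen: 1 unit = 1 node (baseline)
  • RB-L only (corporate): 360,000 yen: 1 unit = 1 node (baseline)
• The above are operated on a "token" basis
  • Tokens equal to (number of nodes applied for) × 360 days × 24 hours are granted
  • Up to the baseline node count, the token consumption coefficient is 1.0 (2.5 for H, 4.0 for L)
  • Beyond the baseline node count, the consumption coefficient for the excess is doubled
  • University users can also transfer tokens to and from Oakforest-PACS
  • A node-fixed option is also available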

Page 62

JPY (=Watt)/GFLOPS rate
Smaller is better (efficient)

System: JPY/GFLOPS
Oakleaf/Oakbridge-FX (Fujitsu) (Fujitsu PRIMEHPC FX10): 125
Reedbush-U (SGI) (Intel BDW): 62.0
Reedbush-H (SGI) (Intel BDW + NVIDIA P100): 17.1
Oakforest-PACS (Fujitsu) (Intel Xeon Phi/Knights Landing): 16.5

Page 63

kpg|q{fwt|fW�7

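For users

• Computations can be finished in a shorter time

• The cost of computation can be reduced

• With the same time and budget, more computation can be done

• Supercomputer programming means understanding the characteristics of the supercomputer system and programming to match them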

• �zryW���d�H�NDeyhxluW��

?.�VTQR]�

• <@Uxm}kUWSD#�V�QRZLE• RB-UX,!)S90%>IW .+~,�/VXZ[($�

• 100�Wsj|W .�+d10%�JbY10��LPKTV


• OFPW��• � �"C1.1�D

5��6CS72.2�D2'�T��

• B%���A44�~1;D&=\�D3.2 MWI_E


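Page 64

What is parallel programming?
• Taking a sequential program (execution time T) and using p computers to bring the execution time down to T / p.
• To a layperson, this seems self-evident.
• In practice, whether it can be done, and how hard it is, depends greatly on the content of the target computation (the algorithm).
  • Some parts of an algorithm can never be parallelized
  • Communication introduces overhead
    • Communication startup time
    • Data transfer time

[Figure: execution time T on one computer versus T / p on p computers]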
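The gap between the ideal T / p and what is actually achieved can be illustrated with a simple estimate. The model below is an assumption added for illustration (an Amdahl-style serial fraction plus a startup-plus-transfer communication cost); only T and p come from the slide, and all other parameter names and values are hypothetical.

```python
def parallel_time(t_seq: float, p: int, serial_fraction: float = 0.0,
                  messages: int = 0, startup_s: float = 0.0,
                  bytes_sent: float = 0.0,
                  bandwidth_bytes_per_s: float = 1.0) -> float:
    """Estimated execution time on p computers.

    compute: the non-parallelizable part runs at full length,
             the rest is divided by p.
    comm:    per-message startup time plus data transfer time.
    """
    compute = t_seq * (serial_fraction + (1.0 - serial_fraction) / p)
    comm = messages * startup_s + bytes_sent / bandwidth_bytes_per_s
    return compute + comm


def speedup(t_seq: float, t_par: float) -> float:
    """Ratio of sequential to parallel execution time."""
    return t_seq / t_par


T = 100.0  # sequential execution time in seconds (illustrative value)
p = 4

ideal = parallel_time(T, p)
with_serial = parallel_time(T, p, serial_fraction=0.1)
with_comm = parallel_time(T, p, serial_fraction=0.1, messages=1000,
                          startup_s=1e-5, bytes_sent=1e9,
                          bandwidth_bytes_per_s=1e10)

for label, t in [("ideal T/p", ideal), ("10% serial", with_serial),
                 ("10% serial + communication", with_comm)]:
    print(f"{label}: {t:.2f} s, speedup {speedup(T, t):.2f}")
```

Even a small non-parallelizable part dominates as p grows: with a serial fraction of 0.1 the speedup in this model can never exceed 10, regardless of how many computers are used.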

Page 65

What is MPI?
• Message Passing Interface

• A library specification for message passing

  • Much simpler to write than TCP/IP socket communication and the like
• Large-scale computation is possible

  • Suited to parallel systems with many processes (Massively Parallel Processing (MPP) systems)

• �����RZJH81 #5�*:<0);

• Easy to port
  • The API (Application Programming Interface) is standardized

• High scalability and performance
  • Because the user writes the communication explicitly, the algorithm can be optimized

Page 66

Oakforest-PACS �����


Page 67

��� ���������
