*Papers written in English [#qa49ab1b]
**Journals and Transactions [#gaa29794]
+Masahiro Nakao, Hitoshi Murai, Hidetoshi Iwashita, Taisuke Boku, Mitsuhisa Sato, "Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language", Int. Journal of High Performance Computing Applications, doi:10.1177/1094342017698214, 14 pages, Mar. 2017. &br;&br;
+Y. Hasegawa, J. Iwata, M. Tsuji, D. Takahashi, A. Oshiyama, K. Minami, T. Boku, H. Inoue, Y. Kitazawa, I. Miyoshi, and M. Yokokawa: Performance evaluation of ultra-large-scale first-principles electronic structure calculation code on the K computer, International Journal of High Performance Computing Applications, August 2014, 28: 335-355, doi:10.1177/1094342013508163, 2014. &br;&br;
+M. Noda, K. Ishimura, K. Nobusada, K. Yabana, T. Boku: Massively-parallel electron dynamics calculations in real-time and real-space: Toward applications to nanostructures of more than ten-nanometers in size, Journal of Computational Physics, Vol.265, pp.145-155, 2014. &br;&br;
+J. Iwata, D. Takahashi, A. Oshiyama, T. Boku, K. Shiraishi, S. Okada and K. Yabana: A massively-parallel electronic-structure calculations based on real-space density functional theory, Journal of Computational Physics, Vol. 229, No. 6, pp. 2339-2363, 2010. &br;&br;
+K. Nakazawa, H. Nakamura, T. Boku, I. Nakata and Y. Yamashita, "CP-PACS: A massively parallel processor at the University of Tsukuba", Parallel Computing, Vol. 25, pp.1635-1661, 1999. &br;&br;
+A. Murata, T. Boku, and H. Amano, "The MDX (Multi-Dimensional X'bar): A Class of Networks for Large Scale Multiprocessors", IEICE Trans. on Information and Systems, Vol.E79-D, No.8, 1996. &br;&br;
+W. G. Hoover, A. J. De Groot, C. G. Hoover, I. F. Stowers, T. Kawai, B. L. Holian, T. Boku, S. Ihara, and J. Belak, "Large-scale elastic-plastic indentation simulations via nonequilibrium molecular dynamics", Physical Review A, Vol.42, No.10, pp.5844-5853, 1990. &br;&br;
+H. Amano, T.
Boku, and T. Kudoh, "(SM)^2: A Large-Scale Multiprocessor for Sparse Matrix Calculations", IEEE Transactions on Computers, Vol.39, No.7, pp.889-905, 1990.
**Conference Proceedings [#nac09ce5]
+Norihisa Fujita, Ryohei Kobayashi, Yoshiki Yamaguchi and Taisuke Boku, "Parallel Processing on FPGA Combining Computation and Communication in OpenCL Programming", Proc. of AsHES2019 (Int. Workshop on Accelerators and Hybrid Exascale Systems) in IPDPS 2019, Rio de Janeiro, May 2019.&br;&br;
+Ryohei Kobayashi, Norihisa Fujita, Yoshiki Yamaguchi, Ayumi Nakamichi and Taisuke Boku, "GPU-FPGA Heterogeneous Computing with OpenCL-enabled Direct Memory Access", Proc. of AsHES2019 (Int. Workshop on Accelerators and Hybrid Exascale Systems) in IPDPS 2019, Rio de Janeiro, May 2019.&br;&br;
+Miwako Tsuji, Taisuke Boku, Mitsuhisa Sato, “Scalable Communication Performance Prediction Using Auto-Generated Pseudo MPI Event Trace”, Proc. of HPC Asia 2019, Guangzhou, Jan. 15th, 2019.&br;&br;
+Ryohei Kobayashi, Norihisa Fujita, Yoshiki Yamaguchi, Taisuke Boku, “OpenCL-enabled high performance direct memory access for GPU-FPGA cooperative computation”, Proc. of IXPUG Workshop Asia 2019 (in HPC Asia 2019), Guangzhou, Jan. 14th, 2019.&br;&br;
+Yuta Hirokawa, Taisuke Boku, Mitsuharu Uematsu, Shunsuke A. Sato, Kazuhiro Yabana, "Performance Optimization and Evaluation of Scalable Optoelectronics Application on Large Scale KNL Cluster", Proc. of ISC2018 (Int. Symposium on Supercomputing), 20 pages, Frankfurt, Jun. 26th 2018.&br;&br;
+Norihisa Fujita, Ryohei Kobayashi, Taisuke Boku, Yuma Oobata, Yoshiki Yamaguchi, Kohji Yoshikawa, Makino Abe, Masayuki Umemura, "Accelerating Space Radiative Transfer on FPGA using OpenCL", Proc. of HEART2018 (Int. Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies), Toronto, Jun. 21st 2018.&br;&br;
+Yuta Hirokawa, Taisuke Boku, Shunsuke A.
Sato, Kazuhiro Yabana, "Performance Evaluation of Large Scale Electron Dynamics Simulation under Many-core Cluster based on Knights Landing", Proc. of HPC Asia 2018 (Int. Conference on High Performance Computing in Asia-Pacific Region), 9 pages, Tokyo, Jan. 30th 2018.&br;&br;
+Ryohei Kobayashi, Yuma Oobata, Norihisa Fujita, Yoshiki Yamaguchi, Taisuke Boku, "OpenCL-ready High Speed FPGA Network for Reconfigurable High Performance Computing", Proc. of HPC Asia 2018 (Int. Conference on High Performance Computing in Asia-Pacific Region), 8 pages, Tokyo, Jan. 30th 2018.&br;&br;
+Masahiro Nakao, Hitoshi Murai, Hidetoshi Iwashita, Akihiro Tabuchi, Taisuke Boku, Mitsuhisa Sato, "Implementing Lattice QCD Application with XcalableACC Language on Accelerated Cluster", Proc. of IEEE Cluster2017, Hawaii, Sep. 2017. &br;&br;
+Akihiro Tabuchi, Masahiro Nakao, Hitoshi Murai, Taisuke Boku, Mitsuhisa Sato, "Implementation and Evaluation of One-sided PGAS Communication in XcalableACC for Accelerated Clusters", Proc. of CCGrid2017, Madrid, May 15th 2017.&br;&br;
+Kenta Sato, Norihisa Fujita, Toshihiro Hanawa, Taisuke Boku, Khaled Z. Ibrahim, "GPU-ready GASNet Implementation on the TCA Proprietary Interconnect Architecture", Proc. of CSCI2016 (Int. Conf. on Computational Science and Computational Intelligence 2016), 6 pages, Las Vegas, Dec. 2016.&br;&br;
+Akihiro Tabuchi, Yasuyuki Kimura, Sunao Torii, Hideo Matsufuru, Tadashi Ishikawa, Taisuke Boku, Mitsuhisa Sato, "Design and Preliminary Evaluation of Omni OpenACC Compiler for Massive MIMD Processor PEZY-SC", Proc. of IWOMP2016 (International Workshop on OpenMP; LNCS 9903: OpenMP: Memory, Devices, and Tasks), pp.293-305, Nara, Oct. 2016.&br;&br;
+Kazuya Matsumoto, Norihisa Fujita, Toshihiro Hanawa, Taisuke Boku, “Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA”, Proc. of VECPAR2016, 8 pages, Porto, Jul.
2016.&br;&br;
+Yuta Hirokawa, Taisuke Boku, Shunsuke Sato, Kazuhiro Yabana, "Electron Dynamics Simulation with Time-Dependent Density Functional Theory on Large Scale Symmetric Mode Xeon Phi Cluster", Proc. of PDSEC2016 (in IPDPS2016), 8 pages, Chicago, 2016.&br;&br;
+Tetsuya Odajima, Taisuke Boku, Toshihiro Hanawa, Hitoshi Murai, Masahiro Nakao, Akihiro Tabuchi, Mitsuhisa Sato, "Hybrid Communication with TCA and InfiniBand on A Parallel Programming Language for Accelerators XcalableACC", Proc. of HUCAA2015 (in Cluster2015), 8 pages, Chicago, Sept. 2015.&br;&br;
+Toshihiro Hanawa, Norihisa Fujita, Tetsuya Odajima, Kazuya Matsumoto, Taisuke Boku, "Evaluation of FFT for GPU Cluster Using Tightly Coupled Accelerators Architecture", Proc. of HUCAA2015 (in Cluster2015), 8 pages, Chicago, Sept. 2015.&br;&br;
+Toshihiro Hanawa, Hisafumi Fujii, Norihisa Fujita, Tetsuya Odajima, Kazuya Matsumoto, Yuetsu Kodama, Taisuke Boku, "Improving Strong-Scaling on GPU Cluster Based on Tightly Coupled Accelerators Architecture", Proc. of IEEE Cluster2015, Chicago, Sept. 2015.&br;&br;
+Kazuya Matsumoto, Toshihiro Hanawa, Yuetsu Kodama, Hisafumi Fujii, Taisuke Boku, "Implementation of CG Method on GPU Cluster with Proprietary Interconnect TCA for GPU Direct Communication", Proc. of AsHES2015 in IPDPS2015, Hyderabad, May 2015. &br;&br;
+K. Tsugane, H. Nuga, T. Boku, H. Murai, M. Sato, W. Tang, B. Wang, "Hybrid-view Programming of Nuclear Fusion Simulation Code in the PGAS Parallel Programming Language XcalableMP", Proc. of ICPADS2014, Hsinchu, Dec. 2014.&br;&br;
+Masahiro Nakao, Hitoshi Murai, Takenori Shimosaka, Akihiro Tabuchi, Toshihiro Hanawa, Yuetsu Kodama, Taisuke Boku, Mitsuhisa Sato, "XcalableACC: Extension of XcalableMP PGAS Language using OpenACC for Accelerator Clusters", Workshop on accelerator programming using directives (WACCPD), New Orleans, LA, USA, Nov. 2014.&br;&br;
+N. Fujita, H. Fujii, T. Hanawa, Y. Kodama, T. Boku, Y. Kuramashi, M.
Clark, "QCD Library for GPU Cluster with Proprietary Interconnect for GPU Direct Communication", Proc. of HeteroPar 2014 (with EuroPar 2014), Porto, 2014.&br;&br;
+Y. Kodama, T. Hanawa, T. Boku, M. Sato, "PEACH2: FPGA based PCIe network device for Tightly Coupled Accelerators", Proc. of HEART2014, Sendai, 2014. (received the Best Paper Award of HEART2014)&br;&br;
+N. Fujita, H. Nuga, T. Boku, Y. Idomura, "Nuclear Fusion Simulation Code Optimization and Performance Evaluation on GPU Clusters", Proc. of PDSEC2014 (with IPDPS2014), Phoenix, 2014.&br;&br;
+T. Odajima, T. Boku, M. Sato, T. Hanawa, Y. Kodama, R. Namyst, S. Thibault, O. Aumage, "Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing", Proc. of Int. Workshop on Advances of Distributed and Parallel Processing 2013 (ADPC-2013, with ICA3PP-2013), Vietri sul Mare, LNCS-8286 Part II, pp.59-68, 2013. &br;&br;
+T. Hanawa, Y. Kodama, T. Boku, M. Sato, "Interconnect for Tightly Coupled Accelerators Architecture", Proc. of HotInterconnect 2013, San Jose, 2013.&br;&br;
+T. Hanawa, Y. Kodama, T. Boku, M. Sato, "Tightly Coupled Accelerators Architecture for Minimizing Communication Latency among Accelerators", Proc. of 3rd Int. Workshop on Accelerators and Hybrid Exascale Systems (AsHES 2013, with IPDPS2013), Boston, CD-ROM, 2013. &br;&br;
+T. Odajima, T. Boku, T. Hanawa, J. Lee, M. Sato, "GPU/CPU Work-Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing", Proc. of P2S2-2012 (with ICPP2012), Pittsburgh, CD-ROM, 2012. &br;&br;
+T. Nomizu, D. Takahashi, J. Lee, T. Boku, M. Sato, "Implementation of XcalableMP Device Acceleration Extention with OpenCL", Proc. of PLC2012 (with IPDPS2012), Shanghai, CD-ROM, 2012. &br;&br;
+M. Nakao, J. Lee, T. Boku, M. Sato, "Productivity and Performance of Global-View Programming with XcalableMP PGAS Language", Proc. of CCGrid2012, Ottawa, CD-ROM. &br;&br;
+S. Otani, H. Kondo, I. Nonomura, A. Ikeya, M. Uemura, Y. Hayakawa, T. Oshita, S.
Kaneko, K. Asahina, K. Arimoto, S. Miura, T. Hanawa, T. Boku, M. Sato, "An 80Gb/s Dependable Communication SoC with PCI Express I/F and 8 CPUs", Proc. of ISSCC2011, San Francisco, CD-ROM, 2011. &br;&br;
+J. Lee, M. T. Tran, T. Odajima, T. Boku, M. Sato, "An Extension of XcalableMP PGAS Language for Multi-node GPU Clusters," Ninth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar 2011), 2011. &br;&br;
+T. Hanawa, T. Boku, S. Miura, M. Sato, K. Arimoto, "PEARL and PEACH: A Novel PCI Express Direct Link and Its Implementation," The Seventh Workshop on High-Performance, Power-Aware Computing (HPPAC 2011) in 25th International Parallel and Distributed Processing Symposium (IPDPS 2011), pp. 866-874, 2011. &br;&br;
+S. Miura, T. Hanawa, T. Boku, M. Sato, "XMCAPI: Inter-core Communication Interface on Multi-chip Embedded Systems," Proc. of EUC 2011, pp.397-402. &br;&br;
+T. Hanawa, T. Boku, S. Miura, M. Sato, K. Arimoto, "PEARL: Power-aware, Dependable, and High-Performance Communication Link Using PCI Express", Proc. of IEEE/ACM International Conference on Green Computing and Communications (GreenCom2010), pp. 284-291, Hangzhou, 2010. &br;&br;
+T. Hanawa, T. Boku, S. Miura, M. Sato, and K. Arimoto, "Power-aware, Dependable, and High-Performance Communication Link Using PCI Express: PEARL," Proc. of IEEE International Conference on Cluster Computing (Cluster2010), poster, 4 pages, Crete, Sep. 2010. &br;&br;
+M. Nakao, J. Lee, T. Boku, M. Sato, "XcalableMP Implementation and Performance of NAS Parallel Benchmarks", Proc. of PGAS10, New York, 2010. &br;&br;
+T. Yonemoto, S. Miura, T. Hanawa, T. Boku, M. Sato, "Flexible Multi-link Ethernet Binding System for PC Clusters with Asymmetric Topology", Proc. of ICPADS2009, Memory-card, Shenzhen, 2009. &br;&br;
+T. Hanawa, M. Sato, J. Lee, T. Imada, H. Kimura, T.
Boku, "Evaluation of Multicore Processor for Embedded Systems by Parallel Benchmark Program using OpenMP", Proc. of IWOMP2009, Dresden, 2009. &br;&br;
+S. Miura, T. Hanawa, T. Yonemoto, T. Boku, M. Sato, "RI2N/DRV: Multi-link Ethernet for High-Bandwidth and Fault-Tolerant Network on PC Cluster", Proc. of CAC2009 (included in Proc. of IPDPS2009), CD-ROM, Rome, 2009. &br;&br;
+J. Lee, M. Sato, T. Boku, "OpenMPD: A Directive-Based Data Parallel Language Extension for Distributed Memory Systems", Proc. of 1st Int. Workshop on Parallel Programming Models and System Software for High-End Computing (P2S2) (included in Proc. of ICPP08), Portland, 2008. &br;&br;
+S. Miura, T. Boku, T. Okamoto, T. Hanawa, "A Dynamic Routing Control System for High-Performance PC Cluster with Multi-path Ethernet Connection", Proc. of CAC2008 (included in Proc. of IPDPS2008), CD-ROM, Miami, 2008. &br;&br;
+J. Lee, M. Sato, T. Boku, "Design and Implementation of OpenMPD: An OpenMP-like Programming Language for Distributed Memory Systems", Proc. of Int. Workshop on OpenMP (IWOMP2007), Beijing, 2007. &br;&br;
+T. Okamoto, S. Miura, T. Boku, M. Sato, D. Takahashi, "RI2N/UDP: High bandwidth and fault-tolerant network for a PC-cluster based on multi-link Ethernet", Proc. of CAC2007 (included in Proc. of IPDPS2007), CD-ROM, Long Beach, 2007. &br;&br;
+T. Okamoto, T. Boku, M. Sato, T. Osamu, "P2P Overlay Network for TCP Programming with UDP Hole Punching", Proc. of NPC2006, Tokyo, 2006. &br;&br;
+H. Kimura, M. Sato, Y. Hotta, T. Boku, D. Takahashi, "Empirical Study on Reducing Energy of Parallel Programs using Slack Reclamation by DVFS", Proc. of Cluster2006, Barcelona, 2006. &br;&br;
+S. Sumimoto, K. Ooe, K. Kumon, T. Boku, M. Sato, A. Ukawa, "Scalable Communication Layer for Multi-Dimensional Crossbar Network Using Multiple Gigabit Ethernet", Proc. of ICS2006, Cairns, Australia, 2006. &br;&br;
+T. Boku, M. Sato, A. Ukawa, D. Takahashi, S. Sumimoto, K. Kumon, T. Moriyama, M.
Shimizu, "PACS-CS: A large-scale bandwidth-aware PC cluster for scientific computations", Proc. of CCGrid2006, Singapore, 2006. &br;&br;
+T. Boku, M. Sato, D. Takahashi, H. Nakashima, H. Nakamura, S. Matsuoka, Y. Hotta, "MegaProto/E: Power-Aware High-Performance Cluster with Commodity Technology", Proc. of HP-PAC06 (in IPDPS2006), Rhodes, Greece, 2006. &br;&br;
+S. Miura, T. Okamoto, T. Boku, M. Sato, D. Takahashi, "Low-cost High-bandwidth Tree Network for PC Clusters based on Tagged-VLAN Technology", Proc. of ISPAN2005, pp.84-91, Las Vegas, USA, 2005. &br;&br;
+H. Nakashima, H. Nakamura, M. Sato, T. Boku, S. Matsuoka, D. Takahashi, Y. Hotta, "MegaProto: 1 TFlops/10kW Rack Is Feasible Even with Only Commodity Technology", Proc. of SC05 (CD-ROM), Seattle, USA, 2005. &br;&br;
+T. Boku, K. Onuma, M. Sato, Y. Nakajima, D. Takahashi, "Grid environment for computational astrophysics driven by GRAPE-6 with HMCS-G and OmniRPC", Proc. of Joint Workshop on High-Performance Grid Computing and High-Level Parallel Programming Models, IPDPS2005, Denver, USA, 2005. &br;&br;
+H. Nakashima, H. Nakamura, M. Sato, T. Boku, S. Matsuoka, D. Takahashi, Y. Hotta, "MegaProto: A Low-Power and Compact Cluster for High-Performance Computing", Proc. of Workshop on High Performance Power Aware Computing, IPDPS2005, Denver, USA, 2005. &br;&br;
+Y. Ojima, M. Sato, T. Boku, D. Takahashi, "Design of Software Distributed Shared Memory System using MPI communication layer", Proc. of 4th International Workshop on OpenMP Experiences and Implementations (WOMPEI2005), Tsukuba, Japan, 2005. &br;&br;
+T. Boku, M. Sato, M. Matsubara, D. Takahashi, "OpenMPI - OpenMP like tool for easy programming in MPI", Proc. of 6th European Workshop on OpenMP (EWOMP'04), Stockholm, Sweden, 2004. &br;&br;
+C. Takahashi, M. Kondo, T. Boku, D. Takahashi, H. Nakamura, "SCIMA-SMP: On-chip memory processor architecture for SMP", Proc. 3rd Workshop on Memory Performance Issues (WMPI-2004), pp.121-128, Munich, Germany, 2004.
&br;&br;
+Y. Ohtaki, D. Takahashi, T. Boku, M. Sato, "Parallel Implementation of Strassen's Matrix Multiplication Algorithm for Heterogeneous Clusters", Proceedings of Heterogeneous Computing Workshop 2004 in IPDPS2004, Santa Fe, USA, 2004. &br;&br;
+Y. Hotta, M. Sato, T. Boku, D. Takahashi, C. Takahashi, "Measurement and Characterization of Power Consumption of Microprocessors for Power-aware Cluster", Proceedings of CoolChips VII, Yokohama, Japan, 2004. &br;&br;
+K. Onuma, T. Boku, M. Sato, D. Takahashi, H. Susa, M. Umemura, "Heterogeneous Remote Computing System for Computational Astrophysics with OmniRPC", Proceedings of International Workshop on High Performance Grid Computing and Networking, 2004 International Symposium on Applications and Internet, Tokyo, Jan. 2004. &br;&br;
+Y. Nakajima, M. Sato, T. Boku, D. Takahashi, H. Gotoh, "Performance Evaluation of OmniRPC in a Grid Environment", Proceedings of International Workshop on High Performance Grid Computing and Networking, 2004 International Symposium on Applications and Internet, Tokyo, Jan. 2004. &br;&br;
+S. Miura, T. Boku, M. Sato, D. Takahashi, "RI2N - Interconnection network system for clusters with wide-bandwidth and fault-tolerancy based on multiple links", Proceedings of International Symposium on High Performance Computing 2003 (ISHPC-V), LNCS-2858, pp.342-351, Tokyo, Oct. 2003. &br;&br;
+T. Boku, M. Sato, K. Onuma, J. Makino, H. Susa, D. Takahashi, M. Umemura, A. Ukawa, "HMCS-G : grid enabled hybrid computing system for computational astrophysics", Proceedings of Grid and Advanced Network (GAN'03) in CCGrid2003, pp.558-565, Tokyo, May 2003. &br;&br;
+M. Sato, T. Boku, D. Takahashi, "OmniRPC: a Grid RPC System for Parallel Programming in Cluster and Grid Environment", Proceedings of CCGrid2003, Tokyo, May 2003. &br;&br;
+T. Boku, J. Makino, H. Susa, M. Umemura, T. Fukushige and A.
Ukawa, "Heterogeneous Multi-Computer System: A New Paradigm of Parallel Processing", Proceedings of 2002 International Conference on Parallel Processing in Electrical Engineering, Warsaw, Sep. 2002 (invited talk). &br;&br;
+T. Boku, J. Makino, H. Susa, M. Umemura, T. Fukushige and A. Ukawa, "Heterogeneous Multi-Computer System: A New Platform for Multi-Paradigm Scientific Simulation", Proceedings of 2002 International Conference on Supercomputing, pp.26-34, New York City, Jun. 2002. &br;&br;
+D. Takahashi, M. Sato and T. Boku, "Performance Evaluation of the Hitachi SR8000 Using OpenMP Benchmarks", Proceedings of 4th International Symposium on High Performance Computing (ISHPC 2002), Lecture Notes in Computer Science, No. 2327, pp. 390-400, 2002. &br;&br;
+T. Boku, S. Yoshikawa, M. Sato, C. G. Hoover and W. G. Hoover, "Implementation and performance evaluation of SPAM particle code with OpenMP-MPI hybrid programming", Proceedings of European Workshop on OpenMP (EWOMP) 2001, Barcelona, Sep. 2001. &br;&br;
+T. Boku, M. Matsubara and K. Itakura, "PIO: Parallel I/O System for Massively Parallel Processors", Proceedings of European High Performance Computing and Network Conference 2001 (LNCS-2110), pp.383-392, Amsterdam, Jun. 2001. &br;&br;
+M. Kondo, H. Okawara, H. Nakamura, T. Boku and S. Sakai, "SCIMA: A Novel Processor Architecture for High Performance Computing", Proceedings of HPC Asia'2000, pp.355-360, Beijing, May 2000. &br;&br;
+T. Boku, K. Itakura, S. Yoshikawa, M. Kondo and M. Sato, "Performance Analysis of PC-CLUMP based on SMP-Bus Utilization", Proceedings of WCBC'00 (Workshop on Cluster Based Computing 2000), Santa Fe, May 2000. &br;&br;
+M. Matsubara, H. Numa, and T. Boku, "Commodity Network based Parallel I/O System for Massively Parallel Processors", Proceedings of PDPTA'99, pp.2424-2429, Las Vegas, Jun. 1999. &br;&br;
+T. Boku, M. Mishima, K.
Itakura, "VIPPES : A Virtual Parallel Processing System Simulation Environment", Proceedings of HPC Asia'98, pp.843-853, Singapore, Sep. 1998. &br;&br;
+M. Matsubara, K. Itakura, T. Boku, "Large Scale Molecular Dynamics Simulations on CP-PACS", Proceedings of HPC Asia'98, pp.321-331, Singapore, Sep. 1998. &br;&br;
+K. Kubota, M. Sato, K. Itakura, T. Boku, "Accuracy of fast performance prediction by instrumentation tool EXCIT", Proceedings of HPC Asia'98, pp.1031-1038, Singapore, Sep. 1998. &br;&br;
+K. Kubota, K. Itakura, M. Sato, T. Boku, "Practical Simulation of Large-Scale Parallel Programs and Its Performance Analysis of the NAS Parallel Benchmarks", Proceedings of Euro-Par'98 (LNCS 1470), pp.244-254, Manchester, 1998. &br;&br;
+H. Nakamura, K. Itakura, M. Matsubara, T. Boku, K. Nakazawa, "Effectiveness of Register Preloading on CP-PACS Node Processor", Proceedings of Innovative Architecture for Future Generation High-Performance Processors and Systems, pp.83-90, Maui, Oct. 1997. &br;&br;
+T. Boku, K. Itakura, H. Nakamura, K. Nakazawa, "CP-PACS: A massively parallel processor for large scale scientific calculations", Proceedings of ACM International Conference on Supercomputing'97, pp.108-115, Vienna, Jul. 1997. &br;&br;
+K. Itakura, T. Boku, H. Nakamura, K. Nakazawa, "Performance evaluation of CP-PACS on CG benchmark", Proceedings of HPC Asia'97, pp.678-683, Seoul, Apr. 1997. &br;&br;
+Y. Abei, K. Itakura, T. Boku, H. Nakamura, K. Nakazawa, "Performance Improvement for Matrix Calculation on CP-PACS Node Processor", Proceedings of HPC Asia'97, pp.672-677, Seoul, Apr. 1997. &br;&br;
+T. Boku, H. Nakamura, K. Nakazawa, Y. Iwasaki, "The Architecture of Massively Parallel Processor CP-PACS", Proceedings of 2nd pAs, pp.31-40, Aizu, Mar. 1997. &br;&br;
+T. Morimoto, K. Saito, H. Nakamura, T. Boku, K.
Nakazawa, "Advanced Processor Design Using Hardware Description Language AIDL", Proceedings of Asia and South Pacific Design Automation Conference 1997 (ASP-DAC'97), pp.387-390, Makuhari, Mar. 1997. &br;&br;
+A. Murata, T. Boku, T. Harada, H. Amano, "The MDX (Multi-Dimensional X'bar): A class of networks for large scale multiprocessors", Proceedings of 9th International Conference on Parallel and Distributed Computing Systems (PDCS96), pp.296-303, 1996. &br;&br;
+T. Boku, M. Mishima, K. Itakura, H. Nakamura, K. Nakazawa, "VIPPES: A performance pre-evaluation system for parallel processors", Presented at HPCN Europe'96, Brussels, Apr. 1996. &br;&br;
+K. Itakura, M. Hattori, T. Boku, H. Nakamura, and K. Nakazawa, "Preliminary evaluation of NAS Parallel Benchmarks on CP-PACS", Proceedings of PERMEAN'95, pp.68-77, Beppu, Aug. 1995. &br;&br;
+T. Boku, T. Harada, T. Sone, H. Nakamura, and K. Nakazawa, "INSPIRE : A general purpose network simulator generating system for massively parallel processors", Proceedings of PERMEAN'95, pp.24-33, Beppu, Aug. 1995. &br;&br;
+H. Morimoto, K. Yamazaki, H. Nakamura, T. Boku, and K. Nakazawa, "Superscalar Processor Design with Hardware Description Language AIDL", Proceedings of 2nd Asia Pacific Conference on Hardware Description Language, pp.51-58, Nagoya, Oct. 1994. &br;&br;
+H. Nakamura, T. Wakabayashi, K. Nakazawa, T. Boku, H. Wada, and Y. Inagami, "Pseudo Vector Processor for High-speed List Vector Computation with Hiding Memory Access Latency", Proceedings of IEEE TENCON'94, pp.338-342, Singapore, Sep. 1994. &br;&br;
+H. Nakamura, K. Nakazawa, H. Li, H. Imori, T. Boku, I. Nakata, and Y. Yamashita, "Evaluation of Pseudo Vector Processor based on Slide-Windowed Registers", Proceedings of Hawaii International Conference on System Sciences 27, pp.368-377, Honolulu, Jan. 1994. &br;&br;
+H. Nakamura, H. Imori, K. Nakazawa, T. Boku, I. Nakata, Y. Yamashita, H. Wada, and Y.
Inagami, "A Scalar Architecture for Pseudo Vector Processing based on Slide-Windowed Registers", Proceedings of ACM International Conference on Supercomputing '93, pp.298-307, Tokyo, Jun. 1993. &br;&br;
+T. Boku, A. Murata, and T. Kawai, "Why do experiments and theory disagree on the turbulence transition of the Poiseuille flow?", Proceedings of 4th International Symposium on Computational Fluid Dynamics, pp.121-126, Davis, Sep. 1991. &br;&br;
+T. Kimura, T. Boku, T. Kudoh, and H. Amano, "A Concurrent Program Restructuring System for Scientific Calculations", Proceedings of 24th Hawaii International Conference on System Sciences (IEEE/ACM), pp.390-399, Honolulu, Jan. 1991. &br;&br;
+T. Boku, S. Nomura, and H. Amano, "IMPULSE: A high performance processing unit for multiprocessors for scientific calculation", Proceedings of 15th International Symposium on Computer Architecture (IEEE/ACM), pp.365-372, Honolulu, Jun. 1988. &br;&br;
+T. Boku, T. Kudoh, H. Amano, and H. Aiso, "DIPROS: A distributed processing system for NDL on (SM)^2-II", Proceedings of 20th Hawaii International Conference on System Sciences (IEEE/ACM), pp.208-217, Kona, Jan. 1987. &br;&br;
+T. Kudoh, H. Amano, T. Boku, and H. Aiso, "NDL: A language for solving scientific problems on MIMD machines", Proceedings of 1st Super Computing Symposium (IEEE), pp.55-64, Miami, 1985. &br;&br;
+H. Amano, T. Boku, T. Kudoh, and H. Aiso, "(SM)^2-II: The new version of the sparse matrix solving machine", Proceedings of 12th International Symposium on Computer Architecture (IEEE/ACM), pp.100-107, 1985.
** Poster Presentations [#s0f5bda5]
+Firmansyah Iman, Yoshiki Yamaguchi, Taisuke Boku, “Capability assessment of a multiple-FPGA system for high-performance computing”, HPC in Asia Poster, ISC2016, Frankfurt, Jun.
2016.&br;&br;
+Kazuya Matsumoto, Norihisa Fujita, Toshihiro Hanawa, Taisuke Boku, "Implementation and Performance Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA", HPC in Asia Poster, ISC2016, Frankfurt, Jun. 2016.&br;&br;
+T. Hanawa, Y. Kodama, T. Boku, M. Sato, "Tightly Coupled Accelerators Architecture for Low-Latency Inter-node Communication between Accelerators", SC14 Poster Session, New Orleans, 2014.&br;&br;
+T. Hanawa, Y. Kodama, T. Boku, M. Sato, "Proprietary Interconnect with Low Latency for HA-PACS/TCA", HPC in Asia Session (poster) in Int. Supercomputing Conference (ISC) 2014, Leipzig, 2014. (received the Best Poster Award in the HPC in Asia Poster Session)&br;&br;
+N. Fujita, H. Nuga, T. Boku, Y. Idomura, "Nuclear Fusion Simulation Code Optimization on GPU Clusters", ICPADS2013 (poster), Seoul, 2013.&br;&br;
+T. Odajima, T. Boku, T. Hanawa, J. Lee, M. Sato, R. Namyst, S. Thibault, O. Aumage, “Task size control on high level programming for GPU/CPU work sharing”, HPC in Asia Session (poster), ISC2013. &br;&br;
+T. Hanawa, Y. Kodama, T. Boku, M. Sato, “HA-PACS/TCA: Tightly Coupled Architectures for low-latency communication among GPUs”, HPC in Asia Session (poster), ISC2013. &br;&br;
+H. Umeda, T. Hanawa, M. Shoji, T. Boku, "GPU accelerated Fock matrix preparation in OpenFMO", HPC in Asia Session (poster), ISC2013. &br;&br;
+T. Hanawa, T. Boku, S. Miura, M. Sato, K. Arimoto, "PEACH: A Communication SoC for PCI Express Direct Link," IEEE Symposium on Low-Power and High-Speed Chips (COOL Chips XIV), 1 page, 2011 (PDF), Best Feature Award
*[[Japanese Papers]] [#qa09d8ef]