Curriculum Vitae and Publications

Career

I am currently a Deep Learning Engineer at NVIDIA, where I develop and engineer deep learning frameworks (especially TensorFlow) to improve their performance and efficiency on GPUs. I received my Ph.D. from the Department of Computer Science at Virginia Tech. My research focuses on the performance portability of computational kernels across modern accelerators, such as CPUs, MICs and GPUs.

Selected Publications

Below are some of my publications on high-performance computing:

  • [PPoPP’19] Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang. “SEP-Graph: Finding Shortest Execution Paths for Graph Processing under a Hybrid Framework on GPU”. Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Feb 2019. PDF, Code
  • [Doctoral Dissertation] Kaixi Hou. “Exploring Performance Portability for Accelerators via High-level Parallel Patterns”. Virginia Tech, VTechWorks, May 2018. PDF, Slides
  • [TPDS’18] Kaixi Hou, Hao Wang, Wu-chun Feng. “A Framework for the Automatic Vectorization of Parallel Sort on x86-based Processors”. IEEE Transactions on Parallel and Distributed Systems, May 2018. PDF, Code
  • [IPDPS’18] Kaixi Hou, Hao Wang, Wu-chun Feng, Jeffrey Vetter, Seyong Lee. “Highly Efficient Compensation-based Parallelism for Wavefront Loops on GPUs”. Proceedings of the 32nd IEEE International Parallel and Distributed Processing Symposium, May 2018. PDF, Slides
  • [BigData’17] Xiaodong Yu, Kaixi Hou, Hao Wang, Wu-chun Feng. “Robotomata: A Framework for Approximate Pattern Matching of Big Data on an Automata Processor”. Proceedings of the 5th IEEE International Conference on Big Data, Dec 2017. PDF
  • [SC’17] Ang Li, Weifeng Liu, Mads R. B. Kristensen, Brian Vinter, Hao Wang, Kaixi Hou, Andres Marquez, Shuaiwen Leon Song. “Exploring and Analyzing the Real Impact of Modern On-Package Memory on HPC Scientific Kernels”. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (Best Paper Finalist), Nov 2017. PDF
  • [ICS’17] Kaixi Hou, Weifeng Liu, Hao Wang, Wu-chun Feng. “Fast Segmented Sort on GPUs”. Proceedings of the 31st ACM International Conference on Supercomputing, Jun 2017. PDF, Slides, Code
  • [CF’17] Kaixi Hou, Hao Wang, Wu-chun Feng. “GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs”. Proceedings of the 14th ACM Computing Frontiers Conference, May 2017. PDF, Slides, Code
  • [AsHES’17] Kaixi Hou, Wu-chun Feng, Shuai Che. “Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors”. Proceedings of the 7th International Workshop on Accelerators and Hybrid Exascale Systems, May 2017. PDF, Slides
  • [ICS’16] Hao Wang, Weifeng Liu, Kaixi Hou, Wu-chun Feng. “Parallel Transposition of Sparse Data Structures”. Proceedings of the 30th ACM International Conference on Supercomputing, Jun 2016. PDF, Slides, Code
  • [IPDPS’16] Kaixi Hou, Hao Wang, Wu-chun Feng. “AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-based Multi- and Many-core Processors”. Proceedings of the 30th IEEE International Parallel and Distributed Processing Symposium, May 2016. PDF, Slides, Code
  • [ICS’15] Kaixi Hou, Hao Wang, Wu-chun Feng. “ASPaS: A Framework for Automatic SIMDIZation of Parallel Sorting on x86-based Many-core Processors”. Proceedings of the 29th ACM International Conference on Supercomputing, Jun 2015. PDF, Slides, Code
  • [P2S2’14] Kaixi Hou, Hao Wang, Wu-chun Feng. “Delivering Parallel Programmability to the Masses via the Intel MIC Ecosystem: A Case Study”. Proceedings of the 7th International Workshop on Parallel Programming Models and Systems Software for High-End Computing, Sep 2014. PDF, Slides

Professional Services

  • Program Committee: AsHES (2019), ICPP (2019), Euro-Par (2019)
  • Reviewer (Journals): TPDS (2018-2019), FGCS (2018-2019), Parco (2018-2019), CCPE (2019-2020)
  • Reviewer (Conferences & Workshops): ICCD (2016), HPCC (2017), WRAp (2017), AsHES (2018), ICS (2018), CBDCom (2018)