Chin-Yung Chen, Jenn Tang, Dong-Liang Lee, and Jih-Fu Tu
[1] W.C. Fu & J.H. Patel, Data prefetching in multiprocessor vector cache memories, Proc. 18th Int. Sym. on Computer Architecture, May 1991, 54–63.
[2] J.L. Baer & T.F. Chen, An effective on-chip preloading scheme to reduce data access penalty, Proc. Supercomputing '91, November 1991, 176–186.
[3] T.F. Chen & J.L. Baer, Effective hardware-based data prefetching for high-performance processor, IEEE Trans. on Computers, May 1995, 318–328. doi:10.1109/12.381947
[4] N.P. Jouppi & D. Wall, Available instruction-level parallelism for superscalar and superpipelined machines, Proc. 13th Int. Conf. on Architectural Support for Programming Languages and operating system, April 1989, 272–282. doi:10.1145/70082.68207
[5] K. Hwang, Advanced computer architecture parallelism scalability programmability (McGraw-Hill, 1997).
[6] J. Smith, Sequential program prefetching in memory hierarchical, IEEE Computer, 47(12), 1978, 7–21.
[7] J.-F. Tu, Y.-H. Wang, & L.-H. Wang, A dynamic data prefetching method of improving the memory latency, Fourth Int. Conf./Exhibition on High Performance Computing in Asia-Pacific Region, Beijing, China, May 14–17, 2000, 13–18. doi:10.1109/HPC.2000.846508
[8] G. Doshi, R. Krishnaiyer, & K. Muthukumar, Optimizing software data prefetches with rotating registers, Proc. 2001 Int. Conf. on Parallel Architectures and Compilation Techniques, 2001, 257–267. doi:10.1109/PACT.2001.953306
[9] D.J. Lilja & S.P. Vander Wiel, A compiler-assisted data prefetch controller, Int. Conf. on Computer Design (ICCD '99), 1999, 372–377.
[10] J.L. Hennessy & D. Patterson, Computer architecture: A quantitative approach, 2nd ed. (Morgan Kaufmann, 1996).
[11] SPEC organization, The user guide of SPEC CPU95, 1996.
[12] P. Cao, E. Felten, & K. Li, Implementation and performance of application-controlled file caching, prefetching, and disk scheme, ACM Trans. on Computer Systems, November 1996, 217–234.
[13] J. Smith & J. Lee, Branch prediction strategies and branch target buffer design, IEEE Trans. on Computer, 17(1), 1984, 6–22.
[14] N.P. Jouppi, Improving direct-mapped cache performance by the addition of a small fully-associative cache and performance buffer, Proc. 17th Annual Int. Symp. on Computer Architecture, May 1990, 364–373. doi:10.1109/ISCA.1990.134547
[15] S. Palacharla & R.E. Kessler, Evaluating stream buffers as a secondary cache replacement, Proc. 21st Annual Int. Symp. on Computer Architecture, April 1994, 24–33.
[16] T.-Y. Yeh & Y.N. Patt, Alternative implementations of two-level adaptive branch prediction, 19th Annual Int. Symp. of Computer Architecture, Gold Coast, Australia, May 1992, 124–134. doi:10.1109/ISCA.1992.753310
[17] B. Calder & D. Grunwald, Next cache line and set prediction, 22nd Annual Int. Symp. on Computer Architecture, Santa Margherita Ligure, Italy, June 1995, 287–297. doi:10.1145/223982.224439
[18] S.S. Pinter, Tango: A hardware-based data prefetching technique for superscalar processors, 17th Annual Int. Symp. of Computer Architecture, San Diego, CA, May 1995, 214–225. doi:10.1109/MICRO.1996.566463
[19] D. Joseph & D. Grunwald, Prefetching using Markov predictors, IEEE Trans. on Computers, 48(2), 1999, 121–133. doi:10.1109/12.752653