Academic Output - Theses
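Title: 通過分數之標準設定的研究 (A Study of Standard Setting for Passing Scores)
Author: 鄭明長 (ZHENG, MING-CHANG)
Contributors: 余民寧 (YU, MIN-NING); 鄭明長 (ZHENG, MING-CHANG)
Keywords: passing score (通過分數); standard setting (標準設定); classification (分類); education (教育)
Date: 1993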
1992
Upload time: 2-May-2016 15:27:12 (UTC+8)
Abstract: This study has two purposes: (1) to explore possible methods, based on item response theory, for setting passing-score standards
Description: Master's
國立政治大學
教育學系資料來源 http://thesis.lib.nccu.edu.tw/record/#B2002004311 資料類型 thesis dc.contributor.advisor 余民寧 zh_TW dc.contributor.advisor YU, MIN-NING en_US dc.contributor.author (Authors) 鄭明長 zh_TW dc.contributor.author (Authors) ZHENG, MING-CHANG en_US dc.creator (作者) 鄭明長 zh_TW dc.creator (作者) ZHENG, MING-CHANG en_US dc.date (日期) 1993 en_US dc.date (日期) 1992 en_US dc.date.accessioned 2-May-2016 15:27:12 (UTC+8) - dc.date.available 2-May-2016 15:27:12 (UTC+8) - dc.date.issued (上傳時間) 2-May-2016 15:27:12 (UTC+8) - dc.identifier (Other Identifiers) B2002004311 en_US dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/89462 - dc.description (描述) 碩士 zh_TW dc.description (描述) 國立政治大學 zh_TW dc.description (描述) 教育學系 zh_TW dc.description.abstract (摘要) 本研究之目的有二:(一)探討基於試題反應理論的設定通過分數之標準的可能方法 zh_TW dc.description.tableofcontents 第一章 緒 論. . . . . . . . . . . . . . . . . . . . .1 第一節 研究動機與目的. . . . . . . . . . . . . . . . .1 第二節 研究問題. . . . . . . . . . . . . . . . . . 8 第三節 研究範圍與限制. . . . . . . . . . . . . . . . 9 第四節 重要名詞定義. . . . . . . . . . . . . . . . 10 第二章 文獻探討. . . . . . . . . . . . . . . . . . .12 第一節 通過分數設訂定相關概念. . . . . . . . . . . . . 12 第二節 各種標準設定方法之探討. . . . . . . . . . . . 19 第三節 通過分數之標準設定方法的比較. . . . . . . . . . 30 第四節 試題反映理論及其在設定通過分數上的應用. . . . . . 35 第五節 分類決定的信度評估. . . . . . . . . . . . . . .44 第三章 研究方法. . . . . . . . . . . . . . . . . . .48 第一節 研究假設. . . . . . . . . . . . . . . . . .48 第二節 研究對象與抽樣. . . . . . . . . . . . . . . 49 第三節 研究工具與研究資料. . . . . . . . . . . . . 50 第四節 資料處理與統計分析. . . . . . . . . . . . . 52 第四章 研究結果與討論. . . . . . . . . . . . . . . . 53 第一節 各種設定方法的初步分類結果. . . . . . . . . . .53 第二節 各種設定方法間的比較. . . . . . . . . . . . .58 第三節 綜合討論. . . . . . . . . . . . . . . . .68 第五章 結論與建議. . . . . . . . . . . . . . . . . .71 第一節 結論. . . . . . . . . . . . . . . . . . 71 第二節 建議. . . . . . . . . . . . . . . . . . 72 參考文獻. . . . . . . . . . . . . . . . . . . . . .75 附錄一 標準測試題本. . . . . . . . . . . . . . . . . 93 附錄二 自編測驗. . . . . . . . . . . . . . . . . . 
99 附錄三 教師判斷表. . . . . . . . . . . . . . . . . .103 附錄四 兩份測驗之因素分析結果. . . . . . . . . . . . .104 zh_TW dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#B2002004311 en_US dc.subject (關鍵詞) 通過分數 zh_TW dc.subject (關鍵詞) 標準設定 zh_TW dc.subject (關鍵詞) 分類 zh_TW dc.subject (關鍵詞) 教育 zh_TW dc.title (題名) 通過分數之標準設定的研究 zh_TW dc.type (資料類型) thesis en_US dc.relation.reference (參考文獻) 余民寧(民80 )試題反應理論的介紹(一):測驗理論的發展趨勢 。 研習資訊, 8 卷( 6 期) , 13-18 頁。 余民寧(民81a )試題反應理論的介紹(二):基本概念與假設。 研習資訊, 9 卷( 1 期) , 5-9 頁。 余民寧(民81b )試題反應理論的介紹(三):試題反應模式及其 特性。研習資訊, 9 卷( 2 期) , 6-10 頁。 余民寧(民81c )測驗理論的發展趨勢。政治大學心理研究所主辦: 心理測驗之學術及實務研討會論文。 余民寧(民81d )試題反應理論的介紹(六):能力量尺。 研習資 訊, 9 卷( 5 期) , 8-12 頁。 余民寧(民81e )試題反應理論的介紹(七):訊息函數。 研習資 訊, 9 卷( 6 期) , 5 - 9 頁。 吳裕益(民75 )標準參照測驗通過分數設定方法之研究。政大教研 所博士論文,未出版。 林惠芬(民82 )通過分數設定方法在護理人員按竅筆試測驗之研究 。測驗年刊, 40 輯, 253-262 頁。 許擇基、劉長萱(民81) 試題作答理論簡介。 臺北:中國行為科學社。 郭生玉(民74) 心理與教育測驗。 臺北:精華。 陳英豪、吳裕益(民75 )新舊測驗理論之比較及其應用。 臺南師專 學報,19 期~ 253-290 頁。 Andrew, B. J. & Hecht, J. (1976). A preliminary investigation of two procedures for setting examination standards. Educational and Psychological Measurement, 36,45-50. Angoff, W. H. (1971). Scales, norms, and equivalent scorea. In R. L. Thorndike (Ed.), Educational Measurement(pp.508 -600). Washington, D. C.: American Council on Education. Beaton, A. E., & Allen, N. L. through scale anchoring. Statistics, !2, 191-204. (1992). Interpreting scales Journal of Educational Behuniak, P. JR., Archambault, F. X., & Gable, R. K. (1982). Angoff and Nedelsky standard setti ng procedures: implications for the validity of Pr oficiency test score interpretation. Educational and Psychological measurement, 42, 247-255. Berk, R. A. (1986). A consumer`s guide to setting performance stsndards on criterion-referenced tests. Review of Educational Research, 56(1), 137-172. Berk, R. A. (1976). Determination of optiomal cutting scores In criterion-referenced measurement. Experimental Education, 45, 4-9. Journal of Beuk, C. H. (1984). 
A method for reaching a compromlse between absolute and relative standards in examinations. Journal of Educational Measurement, 21,147-152. Birnbaum, A. (1968). Estimation of an ability. In F. M. Lord and M. R. Novick, Statistical theories of mental test scores (chapters 20). Reading, MA: Addison-Wesley. Block, J. H. (1971). Critertion-referced measurements: Potential. Shool Review,69,289-298. Block, J. N. (1972). Student learning and the setting of mastery performance standards. Educational Horizons, 50, 183-190. Block, J. H. (1978). Standards and criteria: A respose. Journal of Education Measurement, 15, 291-295. Brennan, R. L., & Locb`JQod, R. E. (1980). A comparlson of the Nedelsky and Angoff cutting score procedures using Generalizability theory. Applied psychological measurement, 4, 219-240. Burton, N. W. (1978). Societal standards.Journal of Educational Measurement, 15,263-271. Cascio, W. F., Alexander, R. A., & Barrett, G. V. (1988). Setting cutoff scores: Legal, psychometric, and professional lssues and guidelines. Personnel Psychology, 41, 1-24. Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Bolt, Rinehart & Winston. Cross, L. B., Impara, J. C., Frary, R. B., & Jaeger, R. M. (1984). A comparison of three methods for establishing minimum standards on the national teacher examinationa. Journal of Education Measurement, 21, 113-129. Davis, F. B., Diamond, J. J. (1974). The preparation of criterion-referenced tests. In C. W. Barris, M. C. Alkin, & W. J. Popham. (Eds.), Problems ln criterion referenced measurement. Los &ngeles: UCLA Graduate school of Education, Center for the study of Evaluation. de Gruijter, D. N. M., & Bambleton, R. K. (1984). On problems encountered using decision theory to set cutoff scores. Applied Psychological Measurement, 8, 1- 8. Ebel, R. L.(1971). Critertion-referced measurements: Limitation. Shool Review,69,282-288. Ebe1, R. L. (1972). Essentials of Educational Measurement. Englewood. 
Cli££s, N. J.:Prentice-Hall. Ebel, R. L. (1978). The case for minimum competency testing. phi Delta Kappan, April, 546-549. Ebel, R. L. (1979). Essentials of Educational Mensurement (3rd ed.). Englewood Cli£fs, NJ: Prentice-flaIl. Emrick, J. A. (1971). An evaluation model for mastery testing. Journal of Educational Measurement, ~(4), 321-326. Gagn`e, R. M. (1985). The conditions of Learning and theory of instruction. New York: Holt, Rinechart & Winston. Garcia-Quintana, R. A., & Mappus, L. L. (1980). Using norm-referenced data to set standards for a minimum competency program III the stats of South Carolina: Aieasibility study. Educational Evaluation and Policy Analysis,~, 47-52. Glass, G. V. (1978). Standards and criteria. Journal of Educational Measurement, 15(4), 237-261. Glaser, R. (1963). Instructional technology and the measurement of learning outcomes . American Psychologist, 18, 519-521. Glaser, R., & Klaus, D. J. (1962). Proficiency measurement: Assessing human performance. In R. M. Gagne` (Ed), Psychological Principles in Systems Development(pp.419- 474). New York: Holt, Rinhart and Winston. Glaser, R. & Nitko,A J.(1971).Measurement in learning and instruction. In R. L. Thorndike (Ed.), Educational measurement(pp.625-670). Washington: American Council on Education, Guion, R. M., & Ironson, G. H. (1983). Latent trait theory for organizational research. Organizational Behavior and Human Performance, 31,54-87. Haladyna, T. M., & Roid, G. H. (1983). A cornparlBon of two approaches to criterion-referenced test construction. Journal of Educational Measurement, 20,271-281. Halpin, G., Sigmon, G., competency standards & Halpin, set by G. (1983). Minimum three judgmental procedures:implications for validity. Educational and psychological measurement , 43,185- 196. Hambleton, R. K. (1978). On the use of cut- off scores with criterion- referenced tests in instructional settings. Journal of Educational Measurement, 15(4), 277-290. Hambleton, R. K. 
(1979) Latent trait models and their applications. In R. T. Guest, (Ed), Methodological developments. Washington: Jossey-Bass. Hambleton, R. K.(1980). Test score validity and standars-setting methods. In R. A. Berk, (Ed.), Criterion-referenced Measurement: The state of the art(pp.80-128). Baltimore, MD:John Hopkins University Press. Hambleton, R. K. (1983). Application of item response models to criterion referenced assessment. Applied psychological Measurement, 7, 33-44. Hambleton, R. K. (1989). Principles and selected applications of item response theory. In R. L. Linn (ED.), Educational measurment (3rd ed., pp. 147-200). New York: Macmillan. Hambleton, R. K. (1990). Criterion referenced-testing methods and practices. In T. B. Gutkin & C. R.Reynolds (Eds.), The handbook of school psychology (pp. 388-415). New Jork:John Wiley & Sons. Hambleton, R. K.,Algina, J., & coulson, D. S. (1978). criterion- referenced testing and measurement:A review of technical issues and developments. Review of Educational Research, 48, 1- 47. Hambleton, R. K., & Cook, L. L. (1977). Latent trait models and their use in the analysis of educational test data. Journal of Educational Measurement, 14,75 -96. Hambleton, R. K., & de Application of criterion-referenced Gruijter, D. N. item test response selection. Educational Measurement, 20, 355-367. M. (1983). model to Journal of Hambleton, R. K., & Eignor, D. R. (1978). Guidelines for evaluating criterion-referenced tests and test manuals. Journal of Educational Measurement, 15,321-327. Hambleton, R. K., & Eignor, D. R. (1980). Competency test development ,validation,and standard setting. In R. M. Jaeger & C. K. Tittle (Eds.), Minimum Competency Achievement Testing: Motives, models, measures, and consequences(pp.367-396). Berkeley, CA.: McCutchan. Hambleton, R. K., Mills, C. N. & Simon, R. (1983). Determining the lengths for criterion- referenced tests. Journal of Educational Measurement, 20, 27-38. Hambleton, R. K., & Novick, M. R. (1973). 
Toward an integration of theory and method for criterion-referenced tests. Journal of Education Measurement, 10,159-170. Hambleton, R. K., swaminathan, H., Algin a , J., & Coulson, D. S. (1978). Criterion-referenced testing and measurement: A review of technical issues and developments. Review of Educational research, 48, 1-47. Hambleton, R. K., & Swaminathan, H. (1985). Item response theory : Principles and applications. Boston, Ma: Kluwer-Nijhoff. Hambleton, R. K., Swaminthan, H. & Rogers, H. J. (1991). Fundamentals of item response theory. Newburry Park, CA: SAGE. Harasym, P. H. (1981). A comparison of the Nedelsky and modified Angoff standard-setting procedure on evaluation outcome. Educational and Psycholoical Measurement, 41,725-734. Harris, C. W., (1972).An interpretation of Livingston`s reliability coefficient for criterion-referenceed tests. Journal of Educational Measurement, 9, 27-29. Harris, D.J., & Subkoviak, M. J. (1986). Item analysis: A short-cut statisitic for mastery tests. Educational and Psychological Measurement, 46, 494-507. Hu li l1; L. L., Drasgm`J, F., & Parsons, C. K. (1983). Item response theory: Application to psychological measurement. Homewood, IL: Dow Jones- Irwin. Huynh, H. (1976). On the reliability of decisions In domain-referenced testing. Journal of Educational Measurement, 13, 253-264. Huynh, E. (1978). Reliability of mutiple classifications. Psychmetrika, 45, 317-325. Euynh, E. (1985). Assessing Mastery of basic skills through summative testing. In D. V. Levine, (Ed), Improving student achievement through mastery learning programs. San Francisco, Califoenia: Jossey-Bass. Euynh, E., & Castel, J. (1985). A comparslon of the mllllmax and Rasch approaches to set simultaneous- passlllg scores for subtests. Journal of Education Statistics, 10, 334-344. Jaeger, R. M. (1991). Selection of judges for standard-setting. Educational Measurement: Issues and Practice, 10(2), 3-6. Jaeger, R. M. (1989). Certification of student competence. 
In R. L. Linn (ED.), Educational rneasurment (3rd ed., pp. 147-200). New York: Macmillan. Jaeger, R. M. (1982). An iterative structured judgment process for establishing standards on competency tests: Theory and application Educational Evaluation and Policy Analysis, 4, 461-476. Kane, M. T. (1987). On the use of IRT models with judgemental standard setting procedures. Journal of Educational Measurement, 24, 333-345. Koffler, S. L. (1980). A comparlson of approaches for setting proficiency standards. Journal of Educational measurement, li, 167-178. Kriewal, T. E. (1972). Aspects and applications of criterion-referenced tests. I.llinois school research, 9, 5-18. Levin, H. M. (1978). Educational performance standards:lmage or substance? Journal of Educational Measurement, 15, 309-319. Livingston, S. A. (1975). A utility-based approach to the evaluation of pass/fall testing decision procedures ( Rep. No. Copa-75-01). Princeton, NJ: Center for Occupational and Professional Assessment, Educational Testing Sevice. Livingston, S. A. (1980). Choosing minimum pessing score by stochastic approximation techniques. Education and Psychological Measurement, 40, 859-873. Livingston, S. A., & Zieky, M. J. ("1982). Manual for setting standards on the basic skills assessment tests. Princeton, N. J.: Educational Testing Service. Lord, F. M. (1980).Applications of item response theory to practical test problem. Hillsdale, NJ: Lawrence Erlbaum associates. (P86未key) Mislevy, R. J., & Bock, R. D. (1983). BILOG: Item analysis and test with binary logistic models. Mooresville IN: Scientific Software , Inc. Mislevy, R. J., Johnson, E. G., & Muraki, E. (1992). Sclaing procedures in N~. Journal of Education statistics, 17, 131-154. Mislevy, R. J. & Stocking, M; L. (1989). A consumer`s guide to LOGIST and BILOG. Applied Psychological Measurement, 13, 57-75. Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3-19 . Norcini, J. 
J., Lipner, R. S., Langdon, L. O., & Strecker, C. A. (1987). A comparison of three variations on a standard-setting method. Journal of Educational Measurement, 24, 56-64.
Novick, M. R., & Lewis, C. (1974). Prescribing test length for criterion-referenced measurement. In C. W. Harris, M. C. Alkin, & W. J. Popham (Eds.), Problems in criterion-referenced measurement (CSE Monograph Series in Evaluation, No. 3, pp. 139-158). Los Angeles: Center for the Study of Evaluation, University of California.
Novick, M. R., Lewis, C., & Jackson, P. H. (1973). The estimation of proportions in m groups. Psychometrika, 38, 19-46.
Peng, C.-Y. J., & Subkoviak, M. J. (1980). A note on Huynh's normal approximation procedure for estimating criterion-referenced reliability. Journal of Educational Measurement, 10(2), 359-368.
Plake, B. S., Melican, G. J., & Mills, C. N. (1991). Factors influencing intrajudge consistency during standard-setting. Educational Measurement: Issues and Practice, 10(2), 15-16, 22.
Plake, B. S., & Kane, M. T. (1991). Comparison of methods for combining the minimum passing levels for individual items into a passing score. Journal of Educational Measurement, 28, 249-256.
Popham, W. J., & Husek, T. R. (1969). Implications of criterion-referenced measurement. Journal of Educational Measurement, 6, 1-9.
Popham, W. J. (1978). As always, provocative. Journal of Educational Measurement, 15, 297-300.
Popham, W. J. (1981). Modern educational measurement. Prentice-Hall.
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Chicago: The University of Chicago Press (Original edition was published in 1960).
Reid, J. B. (1991). Training judges to generate standard-setting data. Educational Measurement: Issues and Practice, 10(2), 11-14.
Rowley, G. L. (1982). Historical antecedents of the standard-setting debate: An inside account of the minimal-beardedness controversy. Journal of Educational Measurement, 19, 87-95.
Shannon, G. A., & Cliver, B. A. (1987). An application of item response theory in the comparison of four conventional item discrimination indices for criterion-referenced tests. Journal of Educational Measurement, 24, 347-356.
Saunders, J. C., Ryan, J. P., & Huynh, H. (1981). A comparison of two approaches to setting passing scores based on the Nedelsky procedure. Applied Psychological Measurement, 5, 209-217.
Shepard, L. (1980). Technical issues in minimum competence testing. In D. C. Berliner (Ed.), Review of research in education (Vol. 8). Itasca, Illinois: F. E. Peacock.
Shepard, L. A. (1984). Setting performance standards. In R. A. Berk (Ed.), A guide to criterion-referenced test construction (pp. 169-198). Baltimore, MD: Johns Hopkins University Press.
Skakun, E. N., & Kling, S. (1980). Comparability of methods for setting standards. Journal of Educational Measurement, 17, 229-235.
Smith, R. L., & Smith, J. K. (1988). Differential use of item information by judges using Angoff and Nedelsky procedures. Journal of Educational Measurement, 25, 259-285.
Subkoviak, M. J. (1976). Estimating reliability from a single administration of a criterion-referenced test. Journal of Educational Measurement, 13, 265-276.
Subkoviak, M. J. (1978). Empirical investigation of procedures for estimating reliability for mastery tests. Journal of Educational Measurement, 15, 111-115.
Subkoviak, M. J. (1980). Decision-consistency approaches. In R. A. Berk (Ed.), Criterion-referenced measurement: The state of the art (pp. 129-185). Baltimore, MD: Johns Hopkins University Press.
Subkoviak, M. J. (1988). A practitioner's guide to computation and interpretation of reliability indices for mastery tests. Journal of Educational Measurement, 25, 47-55.
Swaminathan, H., Hambleton, R. K., & Algina, J. (1975). A Bayesian decision-theoretic procedure for use with criterion-referenced tests. Journal of Educational Measurement, 12, 87-98.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.
van der Linden, W. J. (1978). Forgetting, guessing, and mastery: The Macready and Dayton models revisited and compared with a latent trait approach. Journal of Educational Statistics, 3, 305-317.
van der Linden, W. J. (1981). A latent trait look at pretest-posttest validation of criterion-referenced test items. Review of Educational Research, 51, 379-402.
van der Linden, W. J. (1982). A latent trait method for determining intrajudge inconsistency in the Angoff and Nedelsky techniques of standard setting. Journal of Educational Measurement, 19, 295-308.
van der Linden, W. J. (1984). Some thoughts on the use of decision theory to set cutoff scores: Comment on de Gruijter and Hambleton. Applied Psychological Measurement, 8, 9-17.
Warm, T. A. (1978). A primer of item response theory. Springfield, VA: National Technical Information Service.
Wilcox, R. R. (1979). Prediction analysis and the reliability of a mastery test. Educational and Psychological Measurement, 39, 825-839.
Woehr, D. J., Arthur, W., Jr., & Fehrmann, M. L. (1991). An empirical comparison of cutoff score methods for content-related and criterion-related validity settings. Educational and Psychological Measurement, 51, 1029-1039.
Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14, 97-166.
Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA Press.
Yen, W. M. (1987). A comparison of the efficiency and accuracy of BILOG and LOGIST. Psychometrika, 52, 275-291.
Zieky, M. J., & Livingston, S. A. (1977). Manual for setting standards on the basic skills assessment tests. Princeton, NJ: Educational Testing Service.