Title 通過分數之標準設定的研究 (A Study of Standard Setting for Passing Scores)
Author 鄭明長 (ZHENG, MING-CHANG)
Advisor 余民寧 (YU, MIN-NING)
Keywords passing score; standard setting; classification; education
Date 1993; 1992
Uploaded 2-May-2016 15:27:12 (UTC+8)
Abstract The purposes of this study were twofold: (1) to explore possible methods, based on item response theory, for setting the standard of a passing score
Description Master's thesis
National Chengchi University
Department of Education
Source http://thesis.lib.nccu.edu.tw/record/#B2002004311
Type thesis
Identifier B2002004311
URI http://nccur.lib.nccu.edu.tw/handle/140.119/89462
Table of Contents
Chapter 1  Introduction (p. 1)
  1.1  Research motivation and purpose (p. 1)
  1.2  Research questions (p. 8)
  1.3  Scope and limitations of the study (p. 9)
  1.4  Definitions of key terms (p. 10)
Chapter 2  Literature Review (p. 12)
  2.1  Concepts related to setting passing scores (p. 12)
  2.2  Review of standard-setting methods (p. 19)
  2.3  Comparison of standard-setting methods for passing scores (p. 30)
  2.4  Item response theory and its application to setting passing scores (p. 35)
  2.5  Assessing the reliability of classification decisions (p. 44)
Chapter 3  Method (p. 48)
  3.1  Research hypotheses (p. 48)
  3.2  Participants and sampling (p. 49)
  3.3  Instruments and data (p. 50)
  3.4  Data processing and statistical analysis (p. 52)
Chapter 4  Results and Discussion (p. 53)
  4.1  Preliminary classification results for each standard-setting method (p. 53)
  4.2  Comparisons among the standard-setting methods (p. 58)
  4.3  General discussion (p. 68)
Chapter 5  Conclusions and Recommendations (p. 71)
  5.1  Conclusions (p. 71)
  5.2  Recommendations (p. 72)
References (p. 75)
Appendix 1  Standardized test booklet (p. 93)
Appendix 2  Self-constructed test (p. 99)
Appendix 3  Teacher judgment forms (p. 103)
Appendix 4  Factor-analysis results for the two tests (p. 104)